


Rémi Forax's Blog

February 2008 Archives


Sorry but i've toasted your pet

Posted by forax on February 10, 2008 at 05:00 PM | Permalink | Comments (7)

Introduction

The idea is simple. I want to create a calendar service, like the server part of Google Calendar: a server that parses a specific protocol for querying calendars and sends back a response using, for example, the iCal format.
This service can be queried from different languages, so I've decided to use a text-based protocol, like HTTP.
For example, the request to get a calendar named mycalendar as user forax between November 6th and November 7th would look something like this:

GET mycalendar forax CAL/1.0
password: AjxHKRFkRwxx3j9lM2HMow==
from: Sun Nov  6 08:49:37 1994
to: Mon Nov  7 12:34:31 1994
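
A minimal sketch of how a client might assemble such a request in Java. The names CalRequestBuilder and buildRequest are made up for illustration, and the empty line that ends the request is taken from the request production of the grammar (firstline, headers, then a final end of line).

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CalRequestBuilder {
    /** Builds one CAL/1.0 request: first line, one line per header, then an empty line. */
    static String buildRequest(String calendar, String user, Map<String, String> headers) {
        StringBuilder sb = new StringBuilder();
        sb.append("GET ").append(calendar).append(' ').append(user).append(" CAL/1.0\n");
        for (Map.Entry<String, String> e : headers.entrySet()) {
            sb.append(e.getKey()).append(": ").append(e.getValue()).append('\n');
        }
        sb.append('\n'); // the empty line terminates the request
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the headers in insertion order.
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("password", "AjxHKRFkRwxx3j9lM2HMow==");
        headers.put("from", "Sun Nov  6 08:49:37 1994");
        headers.put("to", "Mon Nov  7 12:34:31 1994");
        System.out.print(buildRequest("mycalendar", "forax", headers));
    }
}
```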

More formally, requests can be described by the following grammar, written in the Tatoo EBNF form.
The Tatoo EBNF form uses sections:

  • section tokens declares terminals and their regexes.
  • section blanks declares the regexes for which the lexer will not send a terminal to the parser.
  • section productions declares sequences of terminals and non-terminals which will be reduced to a non-terminal.
? means zero or one, + means at least one and * means zero or more.

tokens:
  service='GET'
  uri= '([^ ])+'
  user= '([a-zA-Z])+'
  protocol= 'CAL/1.0'
  colon= ':'
  header_key= '([^ :\r\n])+'
  header_value= '[^ \r\n]([^\r\n])+'
  eoln= '(\r)?\n'
 
blanks:
 space= "( |\t)+"

productions:
 start = request+
       ;
 request = firstline 'eoln' header* 'eoln'
         ;
 firstline = 'service' 'uri' 'user' 'protocol'
           ;
 header = 'header_key' 'colon' 'header_value' 'eoln'
        ;
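
To make the grammar concrete, here is a rough hand-written recognizer for a single request using java.util.regex. It is only a sketch and not what Tatoo generates: the class name CalRequestCheck and the method parse are invented, the token regexes are transcribed by hand, and the user token is assumed to accept any run of ASCII letters.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CalRequestCheck {
    // firstline = 'service' 'uri' 'user' 'protocol', with blanks skipped between terminals
    private static final Pattern FIRST_LINE =
        Pattern.compile("GET[ \\t]+([^ ]+)[ \\t]+([a-zA-Z]+)[ \\t]+CAL/1\\.0");
    // header = 'header_key' 'colon' 'header_value' (the trailing 'eoln' is consumed by split)
    private static final Pattern HEADER =
        Pattern.compile("([^ :\\r\\n]+)[ \\t]*:[ \\t]*([^ \\r\\n][^\\r\\n]+)");

    /**
     * Returns the headers of the first request in the text, or null if it does
     * not fit the grammar. Anything after the closing empty line is ignored.
     */
    static Map<String, String> parse(String request) {
        String[] lines = request.split("\\r?\\n", -1);
        if (!FIRST_LINE.matcher(lines[0]).matches()) {
            return null;
        }
        Map<String, String> headers = new LinkedHashMap<>();
        for (int i = 1; i < lines.length; i++) {
            if (lines[i].isEmpty()) {
                // An empty line that is itself terminated is the closing 'eoln'.
                return i < lines.length - 1 ? headers : null;
            }
            Matcher m = HEADER.matcher(lines[i]);
            if (!m.matches()) {
                return null;
            }
            headers.put(m.group(1), m.group(2));
        }
        return null; // ran out of input before the closing empty line
    }

    public static void main(String[] args) {
        String request = "GET mycalendar forax CAL/1.0\n"
            + "password: AjxHKRFkRwxx3j9lM2HMow==\n"
            + "from: Sun Nov  6 08:49:37 1994\n"
            + "\n";
        System.out.println(parse(request));
    }
}
```

Unlike this sketch, which needs the whole request in memory, a generated push parser can work on the request piece by piece as the bytes arrive.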

Now Tatoo, a parser generator we have developed, is able to create a non-blocking push lexer/parser for that grammar.
By non-blocking, I mean a parser that works on NIO buffers. And since Christmas, Tatoo has a companion named banzai, a server using NIO that can embed a parser generated by Tatoo.
Great, but how do we specify the semantics, i.e. how do we specify that the password needs to be checked, how the response is computed, etc.?
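
Before getting to the semantics, here is a rough sketch of what "push" and "non-blocking" mean in practice: the caller hands over whatever bytes have arrived, and the lexer keeps its state between calls instead of waiting for more input. This is not Tatoo's generated code or API; PushLineLexer is a hypothetical class that only splits lines, and it assumes ASCII input.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class PushLineLexer {
    private final StringBuilder pending = new StringBuilder(); // text of the unfinished line
    private final List<String> lines = new ArrayList<>();      // completed lines so far

    /** Consumes whatever is available in the buffer and returns without blocking. */
    void push(ByteBuffer buffer) {
        while (buffer.hasRemaining()) {
            char c = (char) (buffer.get() & 0xFF); // one byte per character, for brevity
            if (c == '\n') {
                int len = pending.length();
                if (len > 0 && pending.charAt(len - 1) == '\r') {
                    pending.setLength(len - 1); // accept \r\n as well as \n
                }
                lines.add(pending.toString());
                pending.setLength(0);
            } else {
                pending.append(c);
            }
        }
    }

    List<String> lines() {
        return lines;
    }

    public static void main(String[] args) {
        PushLineLexer lexer = new PushLineLexer();
        // The first line arrives split across two chunks, as it could from a socket.
        lexer.push(ByteBuffer.wrap("GET mycalendar fo".getBytes(StandardCharsets.US_ASCII)));
        lexer.push(ByteBuffer.wrap("rax CAL/1.0\r\n".getBytes(StandardCharsets.US_ASCII)));
        System.out.println(lexer.lines());
    }
}
```

The point of the design is that a chunk boundary can fall anywhere, even in the middle of a token, without the server thread ever having to wait on a read.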

The semantics

The semantics are specified by creating a class that inherits from ProtocolHandler. A ProtocolHandler is an object called each time a terminal is recognized (shifted) or a production is reduced. Furthermore, it provides an object that owns methods like asyncWrite() to send data back to the client or endRequest() to end the request.

 public class CalendarProtocolHandler extends ProtocolHandler { 
  private String uri;
  private String username;
  private String header_key;
  private String header_value;
  private HashMap<String,String> headerMap=...
  ...

  public void shift(RuleEnum rule,TerminalEnum terminal) {
    switch(terminal) {
      case uri:
        uri=decode();
        return;
      case user:
        username=decode();
        return;
      case header_key:
        header_key=decode();
        return;
      case header_value:
        header_value=decode();
        return;
    }
  }

  public void reduce(ProductionEnum production) {
    if (production==ProductionEnum.header) {
      headerMap.put(header_key,header_value);
      return;
    }
    if (production==ProductionEnum.request) {
      handle();
      return;
    }
  }
  
  private void handle() {
    try {
      String password=headerMap.get("password");
      boolean passwdOK=verifyPassword(username,password);
      if (!passwdOK) {
        putUnauthorizedResponse(outBuffer);
        analyzer.asyncWrite(outBuffer);
        return;
      }
      ...
      Calendar c=...
      putICalendar(outBuffer,c);
      analyzer.asyncWrite(outBuffer);
    
    } catch(IOException e) {
      analyzer.endRequest(e);
    }
  }
}

Short explanation of the code above: shift() decodes the buffer to get the value of each terminal it cares about. reduce() stores the header key/value pair into a map when a header is reduced, and calls handle() when a request is reduced. handle() verifies the password and writes a response.
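
To see the order in which these callbacks fire, the sketch below replays them by hand for a request with a single header. Everything in it is a stand-in: the Terminal and Production enums, the Handler class and replay() are hypothetical, the token text is passed in directly instead of being decoded from a buffer, and terminals and productions the handler ignores are left out.

```java
import java.util.HashMap;
import java.util.Map;

class ShiftReduceDemo {
    // Hypothetical stand-ins for the enums a generated parser would provide.
    enum Terminal { uri, user, header_key, header_value }
    enum Production { header, request }

    static class Handler {
        String uri, username, headerKey, headerValue;
        final Map<String, String> headers = new HashMap<>();
        boolean handled;

        /** Called when a terminal is recognized; remembers its text. */
        void shift(Terminal terminal, String text) {
            switch (terminal) {
                case uri: uri = text; break;
                case user: username = text; break;
                case header_key: headerKey = text; break;
                case header_value: headerValue = text; break;
            }
        }

        /** Called when a production is reduced, i.e. when a whole construct is complete. */
        void reduce(Production production) {
            if (production == Production.header) {
                headers.put(headerKey, headerValue);
            } else if (production == Production.request) {
                handled = true; // this is where the handler in the post calls handle()
            }
        }
    }

    /** Replays the callbacks for "GET mycalendar forax CAL/1.0" with one password header. */
    static Handler replay() {
        Handler h = new Handler();
        h.shift(Terminal.uri, "mycalendar");
        h.shift(Terminal.user, "forax");
        h.shift(Terminal.header_key, "password");
        h.shift(Terminal.header_value, "AjxHKRFkRwxx3j9lM2HMow==");
        h.reduce(Production.header);  // the key/value pair is complete
        h.reduce(Production.request); // the whole request has been seen
        return h;
    }

    public static void main(String[] args) {
        Handler h = replay();
        System.out.println(h.username + " asked for " + h.uri + " with headers " + h.headers);
    }
}
```

Because the header is only stored on reduce, the key and value fields can be reused for each header line without any risk of mixing up pairs.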

Banzaï !

I know what you are thinking:
"OK, it's interesting. But embedding a parser in a web server will damage performance".
No, the Tatoo parser is carefully designed not to hurt performance.
I know that you don't believe me, so let me try to convince you with a stupid benchmark(tm).
I have written a subset of the HTTP/1.1 grammar (a subset because the whole HTTP/1.1 grammar is huge and I'm lazy), generated the corresponding parser with Tatoo, written a ProtocolHandler for HTTP, and benchmarked banzai (the server) with that protocol handler, comparing it with Grizzly and Jetty.

Benchmarks

So I've borrowed two DELLs (config) with Gigabit Ethernet cards and a gigabit switch from my lab, plugged them in, and played.
The servers are set up with Gentoo (2.6.19-gentoo-r5) without major modifications. I have just raised the number of file descriptors to 65535.
Because I want to do a stress test, I've used Apache Bench (ab) as the client, so this is not a real scenario, just a stupid benchmark(tm).
If you want to reproduce the test, check out the code with svn checkout https://svnigm.univ-mlv.fr/svn/tatoo/trunk and compile Tatoo using ant.
Banzai is located in a sample directory named httpserver. Use ant all compile to compile banzai, and launch it using

 java -server -cp classes:../../lib/tatoo-runtime.jar fr.umlv.tatoo.samples.httpserver.banzai.Main

First test: how many requests can be handled by banzai

banzai1.png

Serving files of 4k, 8k, 16k and 32k with different numbers of concurrent connections (8, 16, 32, etc.). Each value is a mean over 25 runs of 50 000 requests.
First, when I saw the result of the benchmark, I was astonished. Wow, Linux 2.6 and a Core Duo offer great performance: more than 25 000 requests per second for a 4k file.
Furthermore, it seems that banzai has a problem when there are a lot of concurrent connections, perhaps because banzai doesn't have a strategy like closing keep-alive connections when there are a lot of connections.
Now, a more interesting graph.

Second test: comparing with the others

banzai2.png

Serving a 4k file with different numbers of concurrent connections.
Yes, banzai performs slightly better than Grizzly...
at least when there are not a lot of concurrent connections.
About Jetty, I think I screwed up the configuration, because the slope of the curve is weird and I am not able to explain why.

What's next

I think it's possible to integrate non-blocking parser technology directly into Grizzly.
Another idea is to wrap banzai so that it uses the RESTlet API.
Finally, I have found small improvements to the OpenJDK and I will send patches that will benefit everyone.

Cheers



CICE prototype available and FOSDEM

Posted by forax on February 07, 2008 at 05:48 AM | Permalink | Comments (0)

It's old news, but I've just discovered that Mark Mahieu provides an implementation of the CICE closure proposal, which is an alternative to the BGGA prototype.

By the way, I will be at FOSDEM'08. If you want to meet me, I will try to attend all the Free Java Meetings.

I have decided to finish this entry à la Chris Campbell.
In my ears: Morcheeba, "Dive Deep"
In my eyes: Some cryptic code, like always.




