
Sorry, but I've toasted your pet

Posted by forax on February 10, 2008 at 5:00 PM PST

Introduction

The idea is simple. I want to create a calendar service, like the server part of Google Calendar: a server able to parse a specific protocol for querying calendars and to send back a response, for example in the iCal format.
This service may be queried from different languages, so I've decided to use a text-based protocol, like HTTP.
For example, the request to get a calendar named mycalendar as user forax, between November 6th and November 7th, will look something like this:

GET mycalendar forax CAL/1.0
password: AjxHKRFkRwxx3j9lM2HMow==
from: Sun Nov  6 08:49:37 1994
to: Mon Nov  7 12:34:31 1994
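
I don't show the actual response here, but since the service could answer in the iCal format, the body sent back for such a request could look roughly like this (a minimal, hand-written iCal example, not the actual output of the service):

 BEGIN:VCALENDAR
 VERSION:2.0
 PRODID:-//example//calendar service//EN
 BEGIN:VEVENT
 UID:19941106-demo@example.org
 DTSTART:19941106T100000Z
 DTEND:19941106T110000Z
 SUMMARY:An event between the requested dates
 END:VEVENT
 END:VCALENDAR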

More formally, requests can be described by the following grammar, written in the Tatoo EBNF form.
The Tatoo EBNF form uses sections:

  • the tokens section declares the terminals and their regular expressions.
  • the blanks section declares the regular expressions for which the lexer will not send a terminal to the parser.
  • the productions section declares the sequences of terminals and non-terminals that are reduced to a non-terminal.

? means zero or one, + means at least one, and * means zero or more.


tokens:
  service='GET'
  uri= '([^ ])+'
  user= '([a-zA-Z])+'
  protocol= 'CAL/1.0'
  colon= ':'
  header_key= '([^ :\r\n])+'
  header_value= '[^ \r\n]([^\r\n])+'
  eoln= '(\r)?\n'

blanks:
  space= "( |\t)+"

productions:
start = request+
       ;
request = firstline 'eoln' header* 'eoln'
         ;
firstline = 'service' 'uri' 'user' 'protocol'
           ;
header = 'header_key' 'colon' 'header_value' 'eoln'
        ;

Now, Tatoo, a parser generator we have developed, is able to create a non-blocking push lexer/parser for that grammar.
By non-blocking, I mean a parser that works on NIO buffers, and since Christmas, Tatoo has a companion named banzai, a server using NIO that can embed a parser generated by Tatoo.
Great, but how do we specify the semantics, i.e. how do we specify that the password needs to be checked, how the response is computed, etc.?
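
To illustrate what "non-blocking push" means in practice, here is a minimal sketch of the read loop, with a purely illustrative PushParser interface standing in for the generated parser (the names and signatures below are assumptions for this sketch, not the real Tatoo API):

 import java.io.IOException;
 import java.nio.ByteBuffer;
 import java.nio.channels.SocketChannel;

 // Purely illustrative interface: it stands in for the generated parser,
 // which keeps its own state between calls.
 interface PushParser {
   void push(ByteBuffer buffer);  // consume whatever readable bytes are available
   void end();                    // no more input for this connection
 }

 class NonBlockingReadLoop {
   // Called by the selector loop whenever the channel is readable.
   static void onReadable(SocketChannel channel, ByteBuffer buffer, PushParser parser)
       throws IOException {
     int read = channel.read(buffer);  // never blocks on a non-blocking channel
     if (read == -1) {                 // the peer closed the connection
       parser.end();
       return;
     }
     buffer.flip();                    // switch to reading the bytes just received
     parser.push(buffer);              // the parser resumes where it stopped last time
     buffer.compact();                 // keep any bytes the parser did not consume
   }
 }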

The semantics

The semantics are specified by creating a class that inherits from ProtocolHandler. A ProtocolHandler is an object that is called each time a terminal is recognized (shifted) or a production is reduced. Furthermore, it provides an object that owns methods like asyncWrite(), to send data back to the client, and endRequest(), to end the request.

 public class CalendarProtocolHandler extends ProtocolHandler {
   private String uri;
   private String username;
   private String header_key;
   private String header_value;
   private final HashMap<String, String> headers = new HashMap<String, String>();

   // the shift(), reduce() and handle() methods described below are omitted here
 }

A short explanation of the code above: shift(), when a specific terminal is found, decodes the buffer to extract its value; reduce(), when a header key/value pair is reduced, stores it in the map, and when a whole request is reduced, calls handle(); handle() verifies the password and writes a response, as sketched below.
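
As a rough idea of what that last step could look like, here is a minimal sketch of the password check and the response. Only asyncWrite() and endRequest() are names taken from the API described above; the ResponseWriter type, the handle() signature and the response layout are assumptions made for this sketch:

 import java.nio.ByteBuffer;
 import java.nio.charset.Charset;
 import java.util.Map;

 // Hypothetical stand-in for the object provided by the ProtocolHandler;
 // only asyncWrite() and endRequest() are named in this post.
 interface ResponseWriter {
   void asyncWrite(ByteBuffer buffer);
   void endRequest();
 }

 class CalendarHandlerSketch {
   private static final Charset ASCII = Charset.forName("US-ASCII");

   // Assumed signature: called once a whole request has been reduced.
   void handle(String uri, String username, Map<String, String> headers,
               ResponseWriter writer) {
     String password = headers.get("password");
     if (password == null || !checkPassword(username, password)) {
       writer.asyncWrite(ASCII.encode("CAL/1.0 403 Forbidden\r\n\r\n"));
       writer.endRequest();
       return;
     }
     // Compute the calendar entries between the 'from' and 'to' headers
     // and send them back, for example in the iCal format.
     String ical = "BEGIN:VCALENDAR\r\nVERSION:2.0\r\nEND:VCALENDAR\r\n";
     writer.asyncWrite(ASCII.encode("CAL/1.0 200 OK\r\n\r\n" + ical));
     writer.endRequest();
   }

   private boolean checkPassword(String username, String password) {
     return true;  // placeholder: look the user up in whatever store the service uses
   }
 }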

Banzaï !

I know what you are thinking:
"OK, it's interesting. But embedding a parser in a web server will hurt performance."
No, the Tatoo parser is carefully designed not to hurt performance.
I know you don't believe me, so let me try to convince you with a stupid benchmark(tm).
I have written a subset of the HTTP/1.1 grammar (a subset because the whole HTTP/1.1 grammar is huge and I'm lenient), generated the corresponding parser with Tatoo, written a ProtocolHandler for HTTP, and benchmarked banzai (the server) with that protocol handler, comparing it with Grizzly and Jetty.

Benchmarks

So I've borrowed two DELLs (config) with Gigabit Ethernet cards and a Gigabit switch from my lab, plugged them in, and played.
The servers are set up with Gentoo (2.6.19-gentoo-r5) without major modifications; I have just raised the number of file descriptors to 65535.
Because I want to do a stress test, I've used apache bench (ab) as the client, so this is not a real-world scenario, just a stupid benchmark(tm).
If you want to reproduce the test, check out the code with svn checkout https://svnigm.univ-mlv.fr/svn/tatoo/trunk and compile Tatoo using ant.
Banzai is located in a sample directory named httpserver. Use ant all compile to compile banzai, then launch it with

 java -server -cp classes:../../lib/tatoo-runtime.jar fr.umlv.tatoo.samples.httpserver.banzai.Main
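
For the record, the client side is a plain apache bench invocation along these lines (the host, port and file name below are placeholders, adjust them to your setup):

 ab -k -n 50000 -c 32 http://server:8080/file-4k.html

-n sets the number of requests, -c the number of concurrent connections and -k enables keep-alive.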


First test: how many requests can be handled by banzai

[figure: banzai1.png]

Serving files of 4k, 8k, 16k and 32k with different numbers of concurrent connections (8, 16, 32, etc.). Each value is a mean over 25 runs of 50,000 requests.
When I first saw the results of the benchmark, I was astonished. Wow, Linux 2.6 and a Core Duo offer great performance, more than 25,000 requests per second for a 4k file.
On the other hand, it seems that banzai has a problem when there are a lot of concurrent connections, perhaps because banzai doesn't have a strategy such as closing keep-alive connections when there are too many of them.
Now, a more interesting graph.

Second test: comparing with the others

[figure: banzai2.png]

Serving a 4k file with different numbers of concurrent connections.
Yes, banzai performs slightly better than Grizzly ...
at least when there are not a lot of concurrent connections.
About Jetty, I think I screwed up the configuration, because the slope of the curve is weird and I am not able to explain why.

What's next

I think it's possible to integrate the non-blocking parser technology directly into Grizzly.
Another idea is to wrap banzai with the RESTlet API.
Finally, I have found small improvements to the OpenJDK and I will send patches that will benefit everyone.

Cheers


Comments

Salut, you should post the configuration you used for Jetty and Grizzly. I'm interested to see how many threads are used :-) A+ -- Jeanfrancois

Salut, we have found an issue with 1.7.1 when the bundle is used, which is unfortunately the one you used (significant performance regression). We released 1.7.2, which will probably change the data (1.7.0 doesn't have the issue). Let us know :-)

OK, I will test.

Hello. I cannot understand why I can't post a comment on your blog http://weblogs.java.net/blog/forax/archive/2007/05/java_property_d.html so I will post my comment here. First, I have to say thanks for all you have done about Java properties. And I would like to continue the discussion about properties. I have something to add ))

Yes :-) You might get better performance if you add: -Dcom.sun.grizzly.maxThreads=5 (default was 20, which is a bug :-)). For the kind of load you are doing, it can make a difference :-) A+

Hi Jean-François, hi Charlie,
As specified on the Grizzly web site, I haven't written a conf, I just typed java -server -jar grizzly-http-webserver-1.7.1.jar. Are there some hidden parameters?

+1 on sharing the Grizzly configuration. Would you be interested in participating, contributing and integrating non-blocking parsers technology into Grizzly? We would really like to have your participation. And, I'm sure OpenJDK would be interested in your patches too.