Messaging is degenerate RPC
Posted by cajo on June 25, 2006 at 01:28 PM | Comments (35)
Allow me to start with a small disclaimer. For those who do not already know, I lead the cajo project, where we promote the idea that the internet can be a collection of World Wide Virtual Machines, where remote objects are used just as local objects.
Yet there is a small but rather vocal community that maintains that distributed object RPC is somehow flawed, and that 'messaging' is a superior solution. Being a somewhat more mature developer, I remember old tales from my childhood, where 'modern' twentieth century businessmen were admonished:
"Avoid the telephone; the network is unreliable: the post is superior."
Clearly they were wrong then; yet surprisingly this ideology is again being promulgated as fact.
To be sure, synchronous invocations are odious, and they are difficult to scale. For example, I actually prefer email to the telephone, since I am not required to respond immediately. Email does not disrupt my work, and I can deal with it when I am ready. However, it is easy to forget: sending email is still synchronous RPC! No matter how you slice it, data must be sent from one computer to another, over the network.
A brand new class has just been added to the cajo project; in fewer than 30 lines of code, it clearly illustrates that 'messaging' can be trivially implemented, including queueing, in a synchronous RPC environment:
public final class Queue implements gnu.cajo.invoke.Invoke {
   private java.util.LinkedList list; // alternating method names and arguments
   private final Object object;       // the local or remote target object
   public Queue(Object object) { this.object = object; }
   public synchronized Object invoke(String method, Object args) {
      if (list == null) { // first call: create the list and start the worker
         list = new java.util.LinkedList();
         new Thread(new Runnable() {
            public void run() {
               try {
                  while (true) { // keep servicing queued calls
                     while (list.size() == 0)
                        synchronized(Queue.this) { Queue.this.wait(); }
                     String method = (String)list.removeFirst();
                     Object args = list.removeFirst();
                     gnu.cajo.invoke.Remote.invoke(object, method, args);
                  }
               } catch(Exception x) { return; } // stop the worker on any failure
            }
         }).start();
      }
      list.add(method);
      list.add(args);
      notify();
      return Boolean.TRUE; // acknowledges receipt only
   }
}
Here any object reference, either local or remote, can be made remotely invokable, in any JVM. Yet every method invocation on a Queue object is executed asynchronously, on its member object, in a separate thread. If result data is desired, one could simply provide a callback object reference as one of the method arguments. The boolean true is returned simply to 'guarantee' that the message has been successfully received.
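The callback idea can be sketched roughly as follows. This is a minimal illustration and not cajo code: CallbackSketch, Callback, and Adder are hypothetical names, and a single-thread ExecutorService stands in for the Queue class so that the example runs without the cajo library.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

final class CallbackSketch {
    /** Hypothetical callback type, passed along as one of the method arguments. */
    interface Callback { void result(Object value); }

    /** Hypothetical target object; its work runs later, on the queue's thread. */
    static final class Adder {
        void add(int a, int b, Callback reply) { reply.result(a + b); }
    }

    /** Queues one call, then waits only so the demo can return the answer. */
    static Object demo() {
        // Assumption: a single-thread executor plays the role of the Queue class.
        ExecutorService queue = Executors.newSingleThreadExecutor();
        Adder adder = new Adder();
        AtomicReference<Object> answer = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);
        try {
            // execute() returns at once; the result arrives through the callback.
            queue.execute(() -> adder.add(2, 3, v -> { answer.set(v); done.countDown(); }));
            done.await();
        } catch (InterruptedException x) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", x);
        } finally {
            queue.shutdown();
        }
        return answer.get();
    }
}
```

The caller is free to continue as soon as the call is queued; the latch exists only so the demonstration can hand back the value delivered to the callback.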
If you want messaging, by all means, go for it! It is a very useful technique. However, it must be seen for what it is; degenerate synchronous RPC: i.e. a tiny subset of the functionality that is possible with distributed objects.
Comments
Comments are listed in date ascending order (oldest first)
-
The reverse argument could be made as well: That RPC is a special case of messaging.
I don't see this as a question of can you build one with the other; but rather a question of which one should be the default.
(sarcasm)How many threads do you intend to start in Java? This isn't Erlang after all ;)(/sarcasm)
Oh, perhaps fixing the race condition in the code would make the argument a little more useful. If "list.add(method)" completes, but not yet "list.add(args)", then "list.size() == 0" would be false and pass the while loop in the runnable.
Posted by: jheintz on June 25, 2006 at 02:51 PM
-
Well, there's a few problems with this post:
1) Sending email is most certainly NOT synchronous RPC. Queueing your message is synchronous, as is putting a message on a message queue, but delivery is NOT synchronous. You think your email client doesn't return until your message is delivered to the recipient? Ever received a message back saying that the destination email address is full / unreachable / doesn't exist? That's because the asynchronous delivery couldn't complete.
2) Just because you can implement your remote RPC using queueing doesn't mean that messaging is just a form of RPC. Messaging and RPC have fundamentally different expectations on the part of the caller, and those semantics are the real difference.
3) I didn't look through all of the code you included... I stopped when I saw the "synchronized" keyword on the method where you're re-implementing queueing. Go look at java.util.concurrent, especially the BlockingQueue.
This post doesn't give me wonderful expectations of the scalability of Cajo.
Posted by: jcarreira on June 25, 2006 at 04:12 PM
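As a rough sketch of the java.util.concurrent suggestion above, the same queueing behaviour could be written on top of a BlockingQueue. The names BlockingDispatcher and Target are hypothetical, Target stands in for the call to gnu.cajo.invoke.Remote.invoke, and the worker is started eagerly rather than on first use; none of this is taken from cajo itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical stand-in for the post's Queue class, built on BlockingQueue. */
final class BlockingDispatcher {
    /** Receives each queued call; stands in for the wrapped target object. */
    interface Target { void handle(String method, Object args); }

    /** One queued invocation: the method name plus its argument. */
    private static final class Call {
        final String method;
        final Object args;
        Call(String method, Object args) { this.method = method; this.args = args; }
    }

    private final BlockingQueue<Call> calls = new LinkedBlockingQueue<>();

    BlockingDispatcher(Target target) {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    Call c = calls.take(); // blocks until a call is available
                    target.handle(c.method, c.args);
                }
            } catch (InterruptedException x) {
                Thread.currentThread().interrupt(); // exit quietly when interrupted
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    /** Enqueues the call and returns at once; TRUE only acknowledges receipt. */
    Object invoke(String method, Object args) {
        calls.add(new Call(method, args));
        return Boolean.TRUE;
    }

    /** Queues three calls and waits until the worker has handled all of them. */
    static List<String> demo() {
        List<String> seen = new ArrayList<>();
        CountDownLatch done = new CountDownLatch(3);
        BlockingDispatcher d = new BlockingDispatcher((m, a) -> {
            synchronized (seen) { seen.add(m + ":" + a); }
            done.countDown();
        });
        d.invoke("a", 1);
        d.invoke("b", 2);
        d.invoke("c", 3);
        try {
            done.await();
        } catch (InterruptedException x) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", x);
        }
        synchronized (seen) { return new ArrayList<>(seen); }
    }
}
```

Because a method name and its argument travel together as one Call element, the worker can never observe a half-added entry, and no explicit wait/notify is needed.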
-
Hi jheintz:
> The reverse argument could be made as well: That RPC is a special case of messaging.
It could, it just depends on how low into the protocol you want to go.
> How many threads do you intend to start in Java?
One, even then, only if one or more methods are invoked.
> Oh, perhaps fixing the race condition in the code would make the argument a little more useful. If "list.add(method)" completes, but not yet "list.add(args)", then "list.size() == 0" would be false and pass the while loop in the runnable.
Very nice catch, perhaps better would have been the reverse:
synchronized(Queue.this) {
while(list.size() == 0) Queue.this.wait();
}
Thanks,
John
Posted by: cajo on June 25, 2006 at 06:01 PM
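For illustration, here is one way the guarded wait might look in a complete class. It is a sketch under assumptions, not the cajo implementation: GuardedQueue and Handler are hypothetical names, Handler replaces the call to gnu.cajo.invoke.Remote.invoke, the worker starts eagerly, and the sketch goes one step further than the fragment above by also removing both list entries while still holding the lock.

```java
import java.util.LinkedList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical variant of the post's Queue; Handler stands in for Remote.invoke. */
final class GuardedQueue {
    interface Handler { void handle(String method, Object args); }

    private final LinkedList<Object> list = new LinkedList<>();
    private final Handler handler;

    GuardedQueue(Handler handler) {
        this.handler = handler;
        Thread worker = new Thread(this::drain);
        worker.setDaemon(true);
        worker.start();
    }

    /** Worker loop: the emptiness test, the wait, and the removals share one lock. */
    private void drain() {
        try {
            while (true) {
                String method;
                Object args;
                synchronized (this) {
                    while (list.isEmpty()) wait();
                    method = (String) list.removeFirst();
                    args = list.removeFirst();
                }
                handler.handle(method, args); // run the call outside the lock
            }
        } catch (InterruptedException x) {
            Thread.currentThread().interrupt();
        }
    }

    /** Adds both entries under the lock, so the worker never sees only one. */
    synchronized Object invoke(String method, Object args) {
        list.add(method);
        list.add(args);
        notify();
        return Boolean.TRUE;
    }

    /** Queues the numbers 1..100 and returns their sum once all are handled. */
    static int demo() {
        AtomicInteger sum = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(100);
        GuardedQueue q = new GuardedQueue((m, a) -> {
            sum.addAndGet((Integer) a);
            done.countDown();
        });
        for (int i = 1; i <= 100; i++) q.invoke("add", i);
        try {
            done.await();
        } catch (InterruptedException x) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", x);
        }
        return sum.get();
    }
}
```

Holding the lock across both the test and the removal means a producer cannot slip in between them, and running the handler outside the lock keeps a slow call from blocking new invocations.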
-
jcarreira,
> Sending email is most certainly NOT synchronous RPC.
It seems you have tied sending to delivery in your mind; these are completely independent concepts in messaging. Your email is sent over something called Transmission Control Protocol, and it most definitely is synchronous. It is how your email client knows your email server has definitely received your message.
>Just because you can implement your remote RPC using queueing doesn't mean that messaging is just a form of RPC.
I'm afraid you have missed the point completely. This post was not about implementing RPC, nor was it about queueing. Rather, it was about using RPC to demonstrate its degenerative subset; messaging. In other words; when you have RPC, you trivially can perform messaging, however, the converse is not true.
> I stopped when I saw the "synchronized" keyword on the method where you're re-implementing queueing. Go look at java.util.concurrent, especially the BlockingQueue.
Very few installations even run JRE 1.4, astronomically less 1.5; I suspect you must still be in school, at least I hope.
> This post doesn't give me wonderful expectations of the scalability of Cajo.
One day soon, I genuinely hope you will be pleasantly surprised! ;-)
John
Posted by: cajo on June 25, 2006 at 08:12 PM
-
Hi cajo,
I think the "protocol" is the crux of the argument. What "application protocol" is default and used 95% of the time? Sure, you can write messaging-alike systems on top of RPC and vice-versa (ignoring some overhead), but what is the default?
You write that the cajo project promotes the idea that "remote objects are used just as local objects". If that were true why would I ever need to know about (program to) this API? Doesn't the existence and usefulness of this class imply that a developer would need to know about the different remote/local characteristics of objects?
I think systems that promote writing the code without worrying over all the distributed/concurrent issues are very important (erlang, erights, mozart/oz). Those issues are dealt with somewhere - just not in the code at hand. Trying to do that in Java (with heavy-weight threads and shared mutable state) is a much tougher proposition.
If someone uses cajo and creates a distributed system of components that performs terribly, what should they do (after profiling to find the problem)? Change the code or the cluster distribution? If the answer is use a Queue and change the code, then the messaging folks have won the argument.
Thanks, John
Posted by: jheintz on June 25, 2006 at 08:42 PM
-
No, I'm not in school, but thanks for the condescending tone. I'm actually a software architect building enterprise apps, using both synchronous and asynchronous models, as they fit, and using messaging where it fits best.
Your email analogy is still horribly flawed. So let's say I'm queueing a message using JMS on a remote JMS Queue. In that case, it will be sending the message to the JMS server using RMI over JRMP (or possibly IIOP) on top of TCP/IP... SO WHAT? Does that make it any less of an asynchronous messaging system? Messaging is all about delivery and processing being asynchronous (usually), not about the queueing process. Your email delivery isn't finished until it's delivered to the recipient.
Oh, and yes, you can easily implement RPC using messaging, too. Take a look at creating response queues when sending JMS messages and waiting for a response. In fact, implementing it this way gives you transparency about where your call will be processed and could easily let it fan out across a cluster to whichever machine is free to read your message from the Queue.
On the "can't update to 1.5" crap... lots of people are using 1.5. Check out the fact that EJB3 gets lots of attention and requires 1.5. But that's not even the point. If you want to use the java.util.concurrent code, you can get it separately from where Doug Lea's team maintains it, and it runs all the way back to 1.3, I believe. There's no excuse for being ignorant of Doug Lea's work that makes concurrent programming easier and much, much more efficient.
Posted by: jcarreira on June 25, 2006 at 09:16 PM
-
Hello John, (jheintz)
I don't think of it so much as what is default protocol, rather what is more powerful. Default protocols are inevitably transient to capability. Rather than the current approach, I advocate the stronger one.
When I write of using remote objects as local ones, it means precisely that there is no API. A developer can know if the object his code is using is local, or remote, when he needs to. He can even very simply determine when his code is being invoked by a remote object, vs. a local one, again whenever he wants to.
Actually it is quite easy to use objects without regard to remoteness or locality. My last post was one on this subject exactly.
Of course, anyone can write terribly performing code with cajo too, it is no silver bullet. This queueing and messaging topic was simply to illustrate that it can be a tool, but that it is merely a tiny subset of the options available.
Thanks, also John :)
Posted by: cajo on June 25, 2006 at 09:35 PM
-
jcarreira, (no chance you happen to be John too ;)
> I'm actually a software architect building enterprise apps, using both synchronous and asynchronous models, as they fit, and using messaging where it fits best.
You seem to take my post as an assault on messaging; it is far from it. I actually endorse it as a tool, much as I endorse email over the telephone call. Yet it is but one, albeit simplistic, tool. (Actually I even endorse UDP, i.e. unreliable messaging, but let's definitely not go there!)
> So let's say I'm queueing a message using JMS on a remote JMS Queue ... SO WHAT? Does that make it any less of an asynchronous messaging system?
No, you are correct, it is completely asynchronous. As I have stated repeatedly; it is merely acting as a degenerate form of RPC.
> Messaging is all about delivery and processing being asynchronous (usually), not about the queueing process.
OK, this sounds like the key issue here; messaging does not imply delivery, nor processing, nor queueing, nor would it be synchronous: It indicates merely that you successfully sent the message. This is where RPC extends on this concept, surprisingly in many of the ways it seems you assumed messaging did.
> Oh, and yes, you can easily implement RPC using messaging, too.
I really hope the RMI team is not monitoring this thread, they might have a few words to say to you. Actually that would not be easy.
> There's no excuse for being ignorant of Doug Lea's work that makes concurrent programming easier and much, much more efficient.
Perhaps. However in this case I eliminated its need entirely, with only two lines of code, that's pretty efficient.
(At least I think, could Doug do it in one? ;)
Posted by: cajo on June 25, 2006 at 10:54 PM
-
You truly have missed the point, John. First off, the problem with your example is that your application is now holding onto a thread indefinitely. What that means in production is the following:
If the calls take a very long time, threads will be tied up for a very long time. Threads chew resources, so under load your system is not going to perform especially well.
If your application goes down, those threads will be lost and there won't be any way to recover them.
But that's just your example. More than that, the "x is a degenerate case of y" is deliberately antagonistic. The real answer is that you can implement messaging with RPC style calls much as you can implement RPC with messaging style calls, so calling one a degenerate case of the other is absurd.
The usual advantage to messaging is that messaging systems are designed to be asynchronous. RPC systems are typically designed to be synchronous and as a result, they usually mess up the asynchronous implementation. I don't have the time to look at cajo but here are some questions for you (note, where I use the term message(s), in RPC terms, I'm talking about data transfer between two remote end points, i.e. two messages for a synchronous RPC call):
How do you orthogonally monitor traffic between two end points?
How do you route messages to multiple end points?
How do you implement any of the EIP patterns using point to point calls in cajo?
How do you manage transactions in cajo?
How do you do XA transactions in cajo?
How do you implement message persistence in cajo?
How do you do load balancing / clustering in cajo?
How do you manage high availability in cajo?
How do you deal with duplicate messages and priorities in cajo?
Messaging has very tidy answers to all of the above questions. Messaging systems that provide all of the above are available right now which means that if you're writing a framework like cajo and you choose to send messages over a bus, all of the work listed above has already been done and tested. i.e. even if you have answers, I'd be more inclined to trust a messaging vendor that probably has thousands more production deployments behind them than cajo does.
Posted by: dkfn on June 26, 2006 at 06:11 AM
-
No, not a John... I'm Jason.
I'm guessing you've never built a system on a message bus architecture, so I'll just quit with that stuff since you're not getting what I'm saying.
On Doug Lea's work, I'm not talking about the lines of code efficiency. Of COURSE adding synchronized to everything is less typing, but it's also NOT CONCURRENT, it serializes every thread through the one synchronized block.
Take a look at Doug's work... especially around non-blocking concurrent data structures and the java.util.concurrent.atomic classes, which let you do atomic compare-and-swap operations for different types. Non-blocking == near linear scaling. Blocking through the "synchronized" keyword == one thread at a time.
Posted by: jcarreira on June 26, 2006 at 06:34 AM
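To make the compare-and-swap idea concrete, here is a small sketch of a lock-free counter using AtomicInteger from java.util.concurrent.atomic. The class name CasCounter and its demo method are hypothetical; only AtomicInteger.get and AtomicInteger.compareAndSet come from the JDK. It shows the retry pattern only and makes no claim about how far it scales.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical lock-free counter illustrating the compare-and-swap retry loop. */
final class CasCounter {
    private final AtomicInteger value = new AtomicInteger();

    /** Retries compareAndSet until no other thread changed the value in between. */
    int increment() {
        while (true) {
            int current = value.get();
            int next = current + 1;
            if (value.compareAndSet(current, next)) return next;
        }
    }

    int get() { return value.get(); }

    /** Runs the given number of threads, each incrementing perThread times. */
    static int demo(int threads, int perThread) {
        CasCounter counter = new CasCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) counter.increment();
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) t.join();
        } catch (InterruptedException x) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", x);
        }
        return counter.get();
    }
}
```

No thread ever blocks on a monitor here: a thread that loses the race simply re-reads the value and tries again, so no update is lost.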
-
Hi dkfn,
> the problem with your example is that your application is now holding onto a thread indefinitely.
Correct. However it is suspended, except when it has a queue to unload, that's a pretty good use of a thread. Did you know RMI creates a thread per invocation? I have applications that create literally millions of threads, it is no problem, and of course, it also scales beautifully.
> The usual advantage to messaging is that messaging systems are designed to be asynchronous. RPC systems are typically designed to be synchronous
Excellent. I couldn't have said it better myself! My point is that synchronous invocations can be trivially made asynchronous, but asynchronous invocations cannot be made synchronous. This is what I mean by degenerate RPC. (I do not think this word means what you think it means ;)
> and as a result, they usually mess up the asynchronous implementation.
Oh dear.
Posted by: cajo on June 26, 2006 at 06:58 AM
-
Hi Jason, it is nice to make your acquaintance!
I appreciate the spirited dialog, it's what gets people really thinking. To be honest, I don't even consider using messaging anymore.
Also, I certainly do not mean to belittle or detract from Doug's fine work in any way. Concurrency is one of the most difficult aspects of imperative programming. While it is all nicely fixed with functional programming, that is simply trading one set of challenging abstractions (Mr. Turing's) for another (Mr. Gödel's).
I guess my main assertion is that messaging was really great, originally, when that was all there was. Now it seems people overly cling to it, perhaps out of habit, perhaps because it's simple, or maybe out of fear of the unknown.
Full feature synchronous RPC is truly an astonishing thing! It is unparalleled in its power and flexibility. I wouldn't use messaging in a single JVM design, yet RPC effectively allows multiple JVMs to interact much like one.
OK, so this blog entry might have sort-of a 'just bring it' tone, :-) but if it can get even one person to consider, and try synchronous RPC, it definitely will have been worth it.
Posted by: cajo on June 26, 2006 at 07:25 AM
-
Why do you insist that "asynchronous invocations cannot be made synchronous"? If I send a message on a Queue with a temporary Queue created and added as the response Queue, then wait for a response message from whomever processed my message, then it IS synchronous. If I then return whatever was sent back, then it's just synchronous RPC, but built on a nice messaging bus that gives location transparency and cluster-wide scalability to the processing of my (now synchronous) call.
In fact, it's pretty easy to implement this. It would be trivial to implement a proxy-based AOP scheme to make this transparent to the caller of a method.
On the other hand... implementing all of the semantics of a full-featured messaging provider, such as those outlined by dkfn, is much more challenging using RPC than implementing RPC with messaging would be.
Posted by: jcarreira on June 26, 2006 at 07:33 AM
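The request/reply pattern described above can be sketched without any messaging product. In this hedged example, in-process BlockingQueue instances stand in for JMS queues, the per-request replyTo queue plays the part of the temporary response queue, and the names RequestReplyDemo, Request, and call are hypothetical. The "server" merely upper-cases the payload.

```java
import java.util.Locale;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch: a blocking call layered on two asynchronous queues. */
final class RequestReplyDemo {
    /** A request that carries its own reply queue (the "temporary queue"). */
    static final class Request {
        final String payload;
        final BlockingQueue<String> replyTo = new LinkedBlockingQueue<>();
        Request(String payload) { this.payload = payload; }
    }

    private final BlockingQueue<Request> requests = new LinkedBlockingQueue<>();

    RequestReplyDemo() {
        // Stand-in for whichever consumer picks the message up from the queue.
        Thread server = new Thread(() -> {
            try {
                while (true) {
                    Request r = requests.take();
                    r.replyTo.put(r.payload.toUpperCase(Locale.ROOT));
                }
            } catch (InterruptedException x) {
                Thread.currentThread().interrupt();
            }
        });
        server.setDaemon(true);
        server.start();
    }

    /** Looks synchronous to the caller: send the request, then block for the reply. */
    String call(String payload) {
        Request r = new Request(payload);
        try {
            requests.put(r);
            return r.replyTo.take();
        } catch (InterruptedException x) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for reply", x);
        }
    }
}
```

A caller would write new RequestReplyDemo().call("ping") and receive the reply as an ordinary return value. A production version would likely wait with a timeout (BlockingQueue.poll with a time limit) rather than block indefinitely.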
-
I'm not nearly as attracted to Cajo as I am to your point, which is well taken and well said. Email as a particular implementation of RPC, exactly... and so is HTTP, with its own error handling. Then on top of HTTP we place layers upon layers to arrive at various other RPC protocols.
> where we promote the idea that the internet can be a collection of World Wide Virtual Machines; where remote objects are used just as local objects.
Can you remember MPI? Corba? RMI? EJB? JINI? (There's even another one I cannot remember now, peer to peer with Java.) Now we've got Spring remoting, Hessian, Flash AMF, DWR (ajax), and Cajo. My coworkers and I (at various places of employment, all intelligent people) like the idea of remote objects used just as local objects. But the complexities of keeping the distributed system in sync (code version wise), and keeping all nodes up and running, lead us inevitably to arrays of identical boxes, all running the same components.
Back to MPI... what I remember most is that there were only a few problems that really suited distribution. They were all similar to that SETI thing, sending out chunks of data to be processed and returned to some central point. And I think all of these problems have similarities to the SETI project:
-no alien life has been found (no solution, I know, some of you think it's just around the corner...don't hold your breath)
-it doesn't make money, but only uses money.
I agree with your point. You're preaching to the converted, but I'm still not able to make use of this stuff in real life. The solution always results in either colocation (bigger box, more mem), or SOAP.
Taylor
Posted by: tcowan on June 26, 2006 at 07:37 AM
-
Hi Taylor,
Clearly there have been valiant attempts at synchronous distributed object interaction in the past. Of course, some of the earlier implementations were not so good. However, RMI is really good! (and also very easy to use) BTW: All of these schemes to implement RPC over HTTP kind of remind me of radio struggling to remain relevant in the age of television. HTTP will have its niche for sure, but its best days are behind it.
Non-local codebases can create a new set of problems, as you mentioned. With all this power and ease of use, there's gotta be a catch, right? Integration is difficult just on a single JVM application. It will require a rigor, and attention to detail, that is rarely practised today, and that may be one of its best side-effects.
RPC is not only about parallel processing in the style of SETI@home. It can also be used to distribute a single application's load over several machines, vs. the more wasteful current practise of running the same application on multiple machines. It is also creating all kinds of opportunities to share functionality between entities. The stuff from Google is just the beginning.
You can see it. The whole internet is in the midst of going through a profound expansion in capability. It is really exciting, and not without its challenges. It's time to sharpen our tools, try out some new ones, and do some experimenting.
Posted by: cajo on June 26, 2006 at 08:33 AM
-
I recall having a discussion on this very topic with another consultant while working on a trade reconciliation project for a large bank. Here's my 2 cents:
You can build sync. RPC over messaging, and messaging over sync. RPC. The proper frameworks, API patterns, etc., can make one look like the other. (And with Aspect-Oriented programming, you can make them absolutely identical on the surface.) Therefore, at the application-developer level, the plumbing should present the remote services in the form that the application developer wants. That may mean making messaging-based plumbing look like sync. RPC. Or vice versa. Or messaging looking like messaging. Etc.
Going a level deeper, to the actual plumbing, as others have pointed out -- it's a buy vs. build question. At the moment, there are a LOT of benefits for building on a messaging-based infrastructure, if only because you can find commodity tools to monitor queues, deal with threading and transactional issues, etc. Yes, you can buy a CORBA orb and get similar things out of the box on a sync. RPC infrastructure. But bottom line is, if you have a banking application, it is imperative to know where a particular request is in the system. In my opinion, messages and queues make this easier to manage than sync. RPC. HOWEVER, like I said, you can build such a tracking system on top of, say, CORBA. And that might be prudent in some scenarios.
Anyways, one other thing I'd like to bring up is the pattern that many messaging systems use, such as in a service-oriented architecture, that the messages being sent around represent snapshots of entire object graphs at points in time. That can be very helpful compared to distributed objects, in terms of decoupling dependencies. Again, it depends on the problem domain.
And that may sound wishy washy, but it's the bottom line: use the right tool for the right job. :)
Posted by: angben on June 26, 2006 at 09:02 AM
-
Hi John
> that create literally millions of threads
Millions of threads all executing at the same time? It depends which VM you're using, but a quick test suggests that you'd run out of memory after 7190 concurrent threads with the default configuration on my machine.
> but asynchronous invocations cannot be made synchronous
Request / response asynchronous invocations can be made synchronous.
I notice you didn't answer any of my other questions regarding cajo.
Posted by: dkfn on June 26, 2006 at 10:09 AM
-
John -
You do like to be controversial!
Ok, I think there is the danger here that such gross generalisations lead to such heated arguments. I personally work a lot with Jini, and I know (particularly when Jini was SCSL) your opinions there. But getting back to the crux of the post; for me, the difference typically comes from the benefits (and concerns) of breaking the call stack, and removing direct references.
For instance in one of my projects, Neon, it breaks the call stack directly after you call the method (via dynamic proxies) so that agents can move around the network but still be called via the channel without needing to perform lookup. This cut is done via calling out to a dynamic messaging system where portions of the messaging substrate are split across the network; it makes this synchronous by blocking until a message arrives on a reply channel, so that you do have the illusion of synchronous calls. But Neon can automatically make asynchronous interfaces for agents or allow you to call any agent in an asynchronous manner very easily without changing the ultimate destination class.
My point is that there are advantages and issues with both ways of doing things, the main thing being the tradeoff between the flexibility of messaging and the cost of having to manage the loss of a direct call-stack.
But one other thing; ultimately TCP is sent as packets and another word for packets is.... 'messages'
--Calum
Posted by: calum on June 26, 2006 at 11:02 AM
-
dkfn,
> millions of threads all executing at the same time?
Oh no of course, not all running at once :-)
When I first found out that RMI created a thread per invocation, then discarded it; I wondered how it could possibly be efficient, or fast, but it really is, both! A thread per invocation is a really powerful resource.
> request / response asynchronous invocations can be made synchronous.
You can send an asynchronous message, and you can wait for one in return; but that only tries to simulate a synchronous invocation. You have no guarantees that your message will be processed immediately, or even when your return message might come, in fact it might even be invalid by the time it arrives. All you are guaranteed, is that it will be much slower. (mail vs. telephone) Of course, if you really want, you can exactly duplicate this type of pseudo-synchronous invocation with synchronous RPC. Again, sync RPC can do everything messaging can do very simply, plus a lot more things, that messaging can't even touch. Hence messaging is a degenerate form.
> I notice you didn't answer any of my other questions regarding cajo.
That was a lot of questions; in short, all of those things are either already implemented, where they would require more than just a few lines, or left to be done in an application-specific context.
It can even accept and return Fully Functional GUIs, actively synchronously connected back to their server objects. Put another way; we're essentially comparing apples... to an entire farmer's market! ;-)
John
Posted by: cajo on June 26, 2006 at 11:39 AM
-
Hi Calum,
> You do like to be controversial!
Haven't you heard; when a topic ceases to be controversial, it ceases to be interesting. ;-)
> Ok, I think there is the danger here that such gross generalisations lead to such heated arguments.
I don't think of it as a generalisation, nor even gross; rather, it is a theory of mine. I won't go so far as to call it a fact yet, but there sure is a lot of evidence to support the assertion.
Of course I hope it can be remembered, that this is just a friendly exchange of ideas.
> I know (particularly when Jini was SCSL) your opinions there
OK, that was heated, but we all got better :-)
Neon sounds interesting, I'll have to give it a closer look!
> But one other thing; ultimately TCP is sent as packets and another word for packets is.... 'messages'
OK, but on that philosophical note, I get my turn:
We could all get by comfortably, in applications without messaging; but we'd all be dead in the water, without procedures.
John
Posted by: cajo on June 26, 2006 at 12:24 PM
-
> I don't think of it as a generalisation, nor even gross; rather, it is a theory of mine. I won't go so far as to call it a fact yet, but there sure is a lot of evidence to support the assertion.
Really? Where would that be? All I've seen in this thread is evidence to the contrary. In fact you've consistently ignored people's arguments and questions which disputed your assertions.
Oh, and this:
> You can send an asynchronous message, and you can wait for one in return; but that only tries to simulate a synchronous invocation. You have no guarantees that your message will be processed immediately, or even when your return message might come, in fact it might even be invalid by the time it arrives. All you are guaranteed, is that it will be much slower. (mail vs. telephone) Of course, if you really want, you can exactly duplicate this type of pseudo-synchronous invocation with synchronous RPC. Again, sync RPC can do everything messaging can do very simply, plus a lot more things, that messaging can't even touch. Hence messaging is a degenerate form.
...was a classic example. Why would it be slower? What if I'm making 1,000,000 calls per second... With a Queue at least any of the machines can be picking up the messages and responding. With RMI I'm talking to one machine, unless you build fancy cluster-aware proxies, which are really just an admission that you should be using Queueing. Also, Queueing gives you the nice ability to save up the messages during peak load and catch up in a few seconds when the load slows down... That as opposed to blowing up and stopping altogether.
How is this any less guaranteed than an RPC call over RMI? Ever heard of RemoteException?
What are these "more things, that messaging can't even touch"?
Checking my calendar to make sure April 1st didn't sneak up on me somehow...
Posted by: jcarreira on June 26, 2006 at 01:07 PM
-
OK Jason, one more time, as clearly and succinctly as I can make it:
I assert: Everything messaging can do, RPC can do equivalently. Additionally, RPC can do things, synchronous remote procedure calls by definition, that asynchronous messaging cannot. That's it. That's my point.
To disprove my assertion that messaging is degenerate, there need only be one instance, of something messaging does, that is not already in the superset of what RPC can do. That's it. That's all.
> Why would it be slower?
OK, let's try a proof by contradiction, assuming all other factors being equal:
RPC gets a dedicated thread to process its requests immediately, and return the results immediately. Unless the messaging system is going to allocate one thread per incoming message, and use that same thread to also send the result message back, it has to be slower. However, if it did allocate one thread per incoming message for processing and message return, it would be a synchronous Remote Procedure Call, by definition.
I call my position a theory, because I would estimate that a typical RPC system has hundreds, if not thousands of concurrent procedure threads, dynamically allocated to match the load. I would guess a similar messaging system has at least an order of magnitude fewer threads, irrespective of the load. It's precisely because messaging is asynchronous; the timeliness pressures are relaxed.
> What are these "more things, that messaging can't even touch"?
As I mentioned previously, with RPC I can remote a user interface to a client, and have it function as if the server were running locally. On the contrary, there does not exist an AJaX user interface that even approaches the usability of a local Swing app. Further, by the arguments above, I contend that it is impossible, in the asynchronous message domain, even with the aid of ECMAScript.
Now, just to be sure, you do understand this discussion is just for fun, right? ;-)
John
Posted by: cajo on June 26, 2006 at 02:19 PM
-
Ok, one thing messaging can do that RPC can't do efficiently without rebuilding a messaging infrastructure:
Publish and subscribe messaging with Topics
Any implementation of that with RPC will be re-implementing a messaging system.
For the slower argument:
Let's say you have a 4 CPU server. Let's assume there's some number, N, of threads that one CPU can most efficiently use. What I've seen in the past is that this is typically on the order of 4 or 5, so taking it as 5, 20 active threads is the most efficient use of the 4x server. Now, with a messaging system, I can register 20 listeners to the Queue. As the load scales up to and beyond what 20 active listeners can handle at once, you'll continue to get the maximum throughput of the system, since 20 active listeners is, by the definition above, the most efficient use of the resources.
For the RPC system, if it really allocates a new thread per request, then after you get over 20 active clients, your throughput will start to drop off. As they contend for CPU cycles, the context switching will start to make the system thrash as the load continues to increase. With Queueing, the requests have to wait, but your throughput hits the maximum and stays there, whereas with the RPC solution, the throughput peaks then starts to drop off.
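A minimal sketch of the fixed-listener setup described above, assuming plain java.util.concurrent rather than any particular messaging product. The class name FixedListenerDemo, the run method and the 1 ms sleep standing in for real work are all hypothetical. It only shows that the number of requests in progress never exceeds the listener count, however many requests are queued; it makes no claim about actual throughput numbers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: a fixed number of listener threads drain one shared queue.
final class FixedListenerDemo {
    // Processes `requests` jobs with exactly `listeners` threads and returns
    // the highest number of jobs that were ever in progress at the same time.
    static int run(int listeners, int requests) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(requests);
        List<Thread> threads = new ArrayList<>();

        for (int i = 0; i < listeners; i++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        queue.take();                 // block until a request arrives
                        peak.accumulateAndGet(active.incrementAndGet(), Math::max);
                        Thread.sleep(1);              // stand-in for real work (assumption)
                        active.decrementAndGet();
                        done.countDown();
                    }
                } catch (InterruptedException e) {
                    // interrupted by run() once every request has been processed
                }
            });
            t.start();
            threads.add(t);
        }

        // Senders only enqueue; they never wait for a listener to be free.
        for (int i = 0; i < requests; i++) {
            queue.add(i);
        }

        try {
            done.await();                             // wait until all requests are handled
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        } finally {
            for (Thread t : threads) {
                t.interrupt();                        // stop the listeners
            }
        }
        return peak.get();
    }
}
```

Calling FixedListenerDemo.run(20, 1000) queues 1000 requests, yet the returned peak can never be more than 20. The extra requests simply wait in the queue.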
Of course, that's silly... I'd be surprised if the RPC solution isn't going to queue things, even if it's only in terms of NIO which is similar to queueing and breaks the thread-per-request model.
Oh, and for the "relaxed" timeliness pressures... Go ask a financial trading exchange about how nice and relaxed their requirements are and tell them how they should switch from their tried-and-true asynchronous messaging infrastructures to a synchronous RPC architecture... Should go over great!
Posted by: jcarreira on June 26, 2006 at 07:18 PM
-
> However, it is easy to forget: sending email is still synchronous RPC! No matter how you slice it: data must be sent from one computer to another; over the network.
John, I think it's pretty hard to dig yourself out of this one :-)
Using TCP or even RPC between each hop does not make the end to end communication synchronous RPC.
There are plenty of use cases like pub-sub where synchronous RPC is not applicable. Reliable messaging with two hops is another simple case.
Only thing you can say is that you can build the messaging system using RPC. But we already know that.
Posted by: manjuka on June 26, 2006 at 08:38 PM
-
> and for the "relaxed" timeliness pressures... Go ask a financial trading exchange about how nice and relaxed their requirements are
Oh, really... ?
I happen to work in scientific computing, where even a one millisecond delay is unacceptable.
However, let's review some of these major financial trading sites:
NASDAQ: 15 MINUTES delay.
AMEX: 20 MINUTES delay.
NYSE: 30 MINUTES delay.
This is shockingly lax; so how do you defend that?
Posted by: cajo on June 26, 2006 at 09:01 PM
-
Hey Manjuka,
This has been way more fun than I thought! ;-)
> Using TCP or even RPC between each hop does not make the end to end communication synchronous RPC.
Well, we need to take that statement apart a little. TCP guarantees that your message made it from one host to another, without regard to its final destination. On the other hand, RPC is not a hop to hop protocol at all; it is a higher level abstraction, from one JVM to another. It actually guarantees that your invocation both made it there, with all its argument data, and has now returned with synchronous result data, if any.
Posted by: cajo on June 26, 2006 at 09:28 PM
-
> TCP guarantees that your message made it from one host to another, without regard to its final destination. On the other hand, RPC is not a hop to hop protocol at all; it is a higher level abstraction, from one JVM to another. It actually guarantees your invocation both made it there, with all its argument data, but has now returned with synchronous result data, if any.
I didn't say RPC is a hop-to-hop protocol. Unless you expect the recipient's email client to be up all the time, there will be several 'hops' of RPC-based communication needed (not the same as TCP hops). End-to-end, it is not synchronous.
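To make the distinction concrete, here is a rough sketch, assuming a single in-memory relay in place of a real mail server. MailRelaySketch, deliver and fetch are hypothetical names, not part of cajo or any mail API. Each call is an ordinary synchronous invocation, yet the sender's call returns long before the recipient ever looks at the message.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical mail relay: every call is synchronous, yet sender and
// recipient never have to be online at the same moment.
final class MailRelaySketch {
    private final Map<String, Deque<String>> mailboxes = new HashMap<>();

    // Hop 1 (sender -> relay): returns as soon as the message is stored.
    synchronized void deliver(String recipient, String message) {
        mailboxes.computeIfAbsent(recipient, r -> new ArrayDeque<>()).addLast(message);
    }

    // Hop 2 (recipient -> relay): called whenever the recipient comes online.
    // Drains the mailbox and returns its messages in arrival order.
    synchronized List<String> fetch(String recipient) {
        Deque<String> box = mailboxes.remove(recipient);
        if (box == null) {
            return new ArrayList<String>();
        }
        return new ArrayList<String>(box);
    }
}
```

The sender learns only that the relay accepted the message. Whether and when the recipient calls fetch is outside the sender's invocation, which is the sense in which the end-to-end exchange is not synchronous.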
Anyway, rather than dissect a single statement, you should tell us how to do reliable pub-sub and reliable messaging with cajo!
Posted by: manjuka on June 26, 2006 at 10:32 PM
-
Well Manjuka, since you asked,
I am working on an interesting publish/subscribe mechanism for the cajo project. While it is still a beta, and not officially part of the project yet, you can see the source here.
What it does is maintain a multi-dimensional rulespace, which functions much like a semantic ontology in a conventional rule engine. However, in this case, the factbase is not local, but rather resident in each of the subscribers' own memory. This avoids the central memory overload problem affecting conventional rule engines.
Both the subscribers, and the publishers can define the ontology at runtime. This probably sounds a little abstract, in fact, I think this is the most conceptually abstract class in the project, closely followed by the Data class. Fortunately it is a very small class, which makes it easier to understand.
Perhaps a simple example would help. Let's say there are three subscribers: one listens for facts asserted about Hummer H1 vehicles, another for H2s, and the third for H3s. Of course there is no reason the three subscriptions could not be by the same rule object.
Now let's assume by mutually agreed convention, that the ontology places the H1 and H2 under the keyspace: Automobiles, Trucks, Sport Utility, Large; and the H3 under Automobiles, Trucks, Sport Utility, Medium.
Now a publisher can assert a fact that all large SUVs have high fuel consumption. The H1 subscriber would get a notice that H1s have high fuel consumption. The H2 subscriber would get a notice that H2s have high fuel consumption. The H3 listener will not get a notice. Each subscriber infers on the facts it has received, and acts accordingly. Normally a subscriber would listen for several types of facts.
The engine even supports publication and subscription via wildcards, for example, to subscribe to all truck facts, or to publish a fact about all automobiles. The rule engine itself is serialisable, so it is even possible to send the ontology, and all of its subscribers, to another machine, or persist it to disc.
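For readers who want something concrete, the Hummer example can be sketched roughly as follows. This is not the cajo class: KeyspaceSketch, subscribe and publish are hypothetical names, the whole thing runs in one JVM, and simple prefix matching stands in for the engine's wildcard support.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch, not the cajo rule engine: subscribers sit at a leaf of
// an agreed key hierarchy, and a fact published to a key prefix reaches every
// subscriber underneath that prefix.
final class KeyspaceSketch {
    private static final class Subscription {
        final List<String> path;          // e.g. [Automobiles, Trucks, Sport Utility, Large, H1]
        final Consumer<String> listener;

        Subscription(List<String> path, Consumer<String> listener) {
            this.path = path;
            this.listener = listener;
        }
    }

    private final List<Subscription> subscriptions = new ArrayList<>();

    // Registers a listener at the given path; the last element is its leaf.
    void subscribe(Consumer<String> listener, String... path) {
        if (path.length == 0) {
            throw new IllegalArgumentException("path must not be empty");
        }
        subscriptions.add(new Subscription(new ArrayList<>(Arrays.asList(path)), listener));
    }

    // Sends "<leaf>: <fact>" to every subscriber whose path starts with the
    // given prefix, and returns how many subscribers were notified.
    int publish(String fact, String... prefix) {
        List<String> key = Arrays.asList(prefix);
        int notified = 0;
        for (Subscription s : subscriptions) {
            boolean under = s.path.size() >= key.size()
                    && s.path.subList(0, key.size()).equals(key);
            if (under) {
                String leaf = s.path.get(s.path.size() - 1);
                s.listener.accept(leaf + ": " + fact);
                notified++;
            }
        }
        return notified;
    }
}
```

With H1 and H2 subscribed under Automobiles, Trucks, Sport Utility, Large and H3 under Medium, publishing "high fuel consumption" to the Large key notifies the H1 and H2 subscribers only, while publishing to Automobiles alone reaches all three.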
It makes it quite easy to support advanced distributed declarative programmatic functionality. It makes the system response so intelligent; I am a little worried the network might become self-aware ;-)
John
PS Disclaimer: that last bit was purely humour; let's not have everyone re-flame up this already well-toasted thread. :-)
Posted by: cajo on June 27, 2006 at 07:14 AM
-
Umm... both my parents are scientists... Nothing is more lax than a scientific environment.
Large financial institutions, however, tend to be rather strict with the money flowing through their systems. Just because YOU have a 20 minute delay seeing the data doesn't mean the internal system does. Try a forex market.
Any of these will have orders of magnitude more data flowing through them than the average app.
Oh, and I'm glad you've seen the light on messaging, since you're implementing a messaging system. Not sure why, though... Probably better to just integrate ActiveMQ or some other JMS provider.
Posted by: jcarreira on June 27, 2006 at 07:23 AM
-
> Umm... <snip>
I'll pass on the troll bait.
> Probably better to just integrate ActiveMQ or some other JMS provider.
As I am now finished trying to explain to you; I have no need for degenerate functionality: I have messaging's successor, synchronous RPC.
Posted by: cajo on June 27, 2006 at 08:16 AM
-
Ok, I guess I'll stop trying to show you that what you're building is the wheel called "messaging" and it's been built before. Just because you're building it with RPC doesn't make it not messaging, as several of us tried to tell you with your email example as well. Once you decide that what you call it doesn't matter, and realize you're duplicating something that's already a mature and freely available set of functionality, I suggest ActiveMQ.
Posted by: jcarreira on June 27, 2006 at 08:51 AM
-
*sigh*
First, just right off the bat, the delay from the trading sites is BY DESIGN. They delay the traffic in order to provide value to those who actually do need the direct, and current feed. If you want the real time access to the financial data from the major exchanges, give them a call and they will happily provide you with one after meeting some level of requirements and paying a fee. It's a feature, not a bug.
Messaging vs Sync RPC is a debate about design, not implementation. Pretty much EVERYTHING in a modern computer system, particularly one running on a single CPU, is running synchronously. The magic of the OS hides the synchronous nature of the CPU from us by going at blazing speeds, giving us the illusion of simultaneous activity.
So, obviously Messaging systems are based upon Sync RPC, at the lowest level of implementation. But arguing the point that because of that implementation detail Messaging == Sync RPC is silly. You may as well argue that OOP is the same as generic standard Structured Programming, because, after all, they're all just subroutines in the end anyway.
Messaging is different from Sync RPC because you approach the entire design differently. The assumptions your system makes are completely different.
While you may be writing a Messaging system on top of an eventually Sync RPC implementation layer, that doesn't mean you're writing what most folks consider a Sync RPC system.
So, feel free to conflate Messaging with Sync RPC, but this kind of handwaving doesn't really advance the debate one way or the other.
If nothing else, it muddies it with irrelevant details.
Posted by: whartung on June 27, 2006 at 05:24 PM
-
Thanks for joining in the fun Will. :-)
I agree completely, messaging vs. Sync RPC is entirely about design. I find in most cases that messaging complicates things needlessly, by essentially ignoring the very practical reality, that procedures execute synchronously.
I really can't understand the responses that Messaging == Sync RPC either. How can something so obvious, be unclear to some people: (Messaging != RPC && Messaging < RPC) == true;
John
Posted by: cajo on June 28, 2006 at 06:18 AM
-
Gahh... are you sure you work in scientific computing? Scientists, after about the 4th or 5th time you show them an example of how their theory doesn't match reality, will eventually understand that they need to CHANGE THEIR THEOREM. Go back and read the comments again, try to see where people have pointed out the problems with your theorem, and see if anything sticks...
Posted by: jcarreira on June 28, 2006 at 08:18 AM
-
Messaging is a very powerful paradigm.
It presents a lot of advantages.
For a messaging based framework please refer to
http://jt.dev.java.net
Posted by: wwwfswcom on August 19, 2006 at 04:20 PM