Daniel Brookshier

Daniel Brookshier's Blog

If you already know about a P2P service, is that bad?

Posted by turbogeek on March 02, 2004 at 02:12 PM | Comments (3)

I’ve had the same conversation with four different people this month. The conversation started when I was showing how a well-known ID can directly connect two peers. All four made the same observation: Doesn’t that go against the idea that P2P is about creating ad-hoc networks and the dynamic discovery of services and resources?

They have been led to believe that P2P applications should not start with any information and should only be allowed to discover things in the network. They fell in love with the idea of discovering treasures by looking in hidden places. It is hard to tell if it is wanderlust or just an old assumption that stuck in their minds. The P2P literature has not helped much either, because most of it discusses P2P in terms of searching. This is not surprising, considering that most current P2P software is oriented toward file sharing, where 'all' data is 'discovered'.

The second thing that many in P2P talk about is discovery of a service as opposed to a file. This is what I call an inverse software agent. Software agents usually get passed from computer to computer to do work. The inverse agent is one that you discover someplace else and use on your PC to interact with other agents. An example might be a dating service that you discover and then use to interact with others who are single. The inverse agent makes sense, but there is very little support for it. Yes, it is a good idea, but the software and security support is not there. It is thousands of times more efficient to just write and distribute an application rather than a plugin to extend your network capabilities. An application that generically finds and loads applications is overkill unless it is very generic, at which point it loses its value. Catch-22.

Making all applications start with a search at a higher level of the framework is also not a JXTA strength. JXTA is multipurpose. It is not particularly useful as a search tool (at least without a specific type of search). It is not even optimized for searching. Yes, there are search mechanisms, but they are very limited.

There is one area of search optimization that the JXTA team is working on very hard: Finding another computer listening to a pipe address. Simply, a computer that wants to accept information or create a two-way conversation. That’s what most network applications do, so that’s a reasonable area to optimize. That also means that even if we discover the application somehow, we still have the system to discover other computers in a P2P network.

But why throw away the purest view of ad-hoc networking? Well, I am not throwing it away. Just putting it back where it should belong. Applications should be applications - not a service.

Let's talk about the real crux of the argument: the well-known ID and how it is not a violation of a dynamic P2P network.

A Priori ID / Well-Known-ID

I was once taught a Latin phrase that has stuck with me but I have found difficult to follow all the time: “Eschew Obfuscation.” Translated it is simply, “avoid complicated language”. The problem with this sage advice is that when a new technology comes around we get new buzzwords too. In JXTA, a well-known or a priori ID is an ID that is already known or can be created from known information. The best example of this is an email address. You know your email address, so does your email provider, and so do your friends. Because an email address is usually simple, like turbogeek@ cluck.com, it is simple for any application to require it as a way to identify you. In effect, your email address becomes an ID, plus it can be used for email.
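
As a rough illustration of an ID "created from known information", the sketch below hashes a normalized email address into a fixed-length identifier using only the JDK. The class name WellKnownId, the urn:example: prefix, the lowercasing rule, and the choice of SHA-256 are all assumptions made for this example; this is not JXTA's actual ID format or API.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

/** Hypothetical helper: derives a repeatable ID from an email address. */
final class WellKnownId {

    private WellKnownId() {}

    /** Same email in, same ID out, on every peer that applies this rule. */
    static String fromEmail(String email) {
        // Normalize so "Someone@Example.com " and "someone@example.com" agree.
        String normalized = email.trim().toLowerCase(Locale.ROOT);
        byte[] digest = sha256(normalized.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b & 0xff));
        }
        // "urn:example:" is a placeholder namespace, not a JXTA one.
        return "urn:example:wellknown:" + hex;
    }

    private static byte[] sha256(byte[] input) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(input);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fromEmail("someone@example.com"));
    }
}
```

Two peers that both know the address, and both apply the same rule, arrive at the same ID without ever searching for it.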

But let's get back to confusing buzzwords for a moment. An ID in JXTA is a URN, or Uniform Resource Name. URNs can have many formats, but each is just a unique identifier. The JXTA form of the ID is found everywhere in JXTA, from groups to pipes, and is used to identify versions and types of code or data. IDs are everywhere in JXTA.
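
To make the URN idea a little more concrete, here is a small sketch that uses the JDK's java.net.URI to split a URN into its namespace and its namespace-specific part. The sample string follows the urn:jxta: form; the class and method names are invented for this example and are not part of JXTA.

```java
import java.net.URI;

/** Hypothetical helper that pulls a URN apart with plain JDK classes. */
final class UrnParts {

    private UrnParts() {}

    /** Returns the namespace, e.g. "jxta" for "urn:jxta:jxta-NetGroup". */
    static String namespace(String urn) {
        return split(urn)[0];
    }

    /** Returns everything after the namespace, e.g. "jxta-NetGroup". */
    static String namespaceSpecific(String urn) {
        return split(urn)[1];
    }

    private static String[] split(String urn) {
        URI uri = URI.create(urn);
        if (!"urn".equalsIgnoreCase(uri.getScheme())) {
            throw new IllegalArgumentException("not a URN: " + urn);
        }
        // For "urn:jxta:jxta-NetGroup" this is "jxta:jxta-NetGroup".
        String rest = uri.getSchemeSpecificPart();
        int colon = rest.indexOf(':');
        if (colon <= 0) {
            throw new IllegalArgumentException("missing namespace: " + urn);
        }
        return new String[] { rest.substring(0, colon), rest.substring(colon + 1) };
    }

    public static void main(String[] args) {
        String sample = "urn:jxta:jxta-NetGroup";
        System.out.println(namespace(sample) + " / " + namespaceSpecific(sample));
    }
}
```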

But where is the discovery or ad-hoc capability of this P2P system? Mainly in managing the network to discover peers and route messages. The network is created ad-hoc except for a few seeded peers that are used to discover the rest of the network. Beyond that, there is no requirement or preference that peers ‘discover’ information. Thus for well-known identifiers, the network is created on the fly to connect peers via an identical ID.

Discovery in JXTA, as I have said, is a little inefficient for data searches. There are many methods for creating efficient searches. I described one in an earlier blog entry on a six degrees of separation search (a Kevin Bacon network search). The six-degree search used well-known IDs created from email addresses to create connections between PCs. Using the well-known ID increased the efficiency of the network.

This finally brings me back to the guys who think a known ID is 'not' a way to build an ad-hoc network. You should have seen that the ID is indeed a transient thing, at least for clients using an email address. The connection between peers is also ad-hoc and discovered via pipe routing whenever you are matching peers with an identical ID. In addition, the ID can be formed to represent any subject, service, location, thing, or person. All you need is agreement between the peers looking for the representation. Sounds like well-known IDs are the poster boy for P2P.
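
The matching idea can be sketched with a toy, in-memory board standing in for the network. Nothing here is JXTA code: IdMatchBoard, idFor, and the "subject:" prefix are hypothetical names, and a simple map plays the role that pipe routing would play on a real network. The point is only that two peers who agree on the rule get the same ID, and that who answers on that ID is decided at lookup time.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy, in-memory stand-in for the network; all names are hypothetical. */
final class IdMatchBoard {
    private final Map<String, List<String>> listeners = new HashMap<>();

    /** Both sides apply the same agreed rule to the same known subject. */
    static String idFor(String subject) {
        return "subject:" + subject.trim().toLowerCase(Locale.ROOT);
    }

    /** A peer announces that it is listening on an ID. */
    void listen(String id, String peer) {
        listeners.computeIfAbsent(id, k -> new ArrayList<>()).add(peer);
    }

    /** A peer stops listening; the ID itself needs no cleanup. */
    void leave(String id, String peer) {
        List<String> peers = listeners.get(id);
        if (peers != null) {
            peers.remove(peer);
        }
    }

    /** Looked up fresh on every call, so results follow who is listening now. */
    List<String> find(String id) {
        return new ArrayList<>(listeners.getOrDefault(id, Collections.emptyList()));
    }

    public static void main(String[] args) {
        IdMatchBoard board = new IdMatchBoard();
        board.listen(idFor("Chess"), "peer-A");
        System.out.println(board.find(idFor(" chess ")));
    }
}
```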

I Dream of JINI

Perhaps it is the dream of JINI applications that lingers in the P2P purists' minds? If just walking into a room with a WiFi PDA connects you to a local printer is cool, why can't you apply the same thing to P2P? For example, if I log into JXTA, shouldn't it automatically guess I am not looking for a printer?

P2P is not as straightforward as JINI. Realizing that there is a printer is not the same as discovering a service that does printing and thus might be useful. To look at this another way, you could create a JXTA dating service. If we apply a JINI model, a PDA owned by a single person would automatically load the dating service and start looking for a mate. Might be nice, but really a bit of overkill. On the other hand, if I run a dating application enabled with JXTA, I can use data about your likes and dislikes to create an ID that can match you to someone else like you.
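
One way such an ID might be built, sketched with JDK classes only: canonicalize the list of interests, then derive a name-based UUID from it with java.util.UUID.nameUUIDFromBytes. The class name InterestId and the canonicalization rule (trim, lowercase, sort, de-duplicate) are assumptions for this example, not anything defined by JXTA.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collection;
import java.util.Locale;
import java.util.TreeSet;
import java.util.UUID;

/** Hypothetical helper: turns a set of interests into a repeatable ID. */
final class InterestId {

    private InterestId() {}

    static UUID fromInterests(Collection<String> interests) {
        // Sort and de-duplicate so order and repeats do not change the ID.
        TreeSet<String> canonical = new TreeSet<>();
        for (String interest : interests) {
            canonical.add(interest.trim().toLowerCase(Locale.ROOT));
        }
        String joined = String.join("\n", canonical);
        // Name-based (type 3) UUID: the same bytes always give the same UUID.
        return UUID.nameUUIDFromBytes(joined.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println(fromInterests(Arrays.asList("Jazz", "hiking")));
    }
}
```

Note that this only matches people whose lists are exactly the same after canonicalization. An application that wants partial matches could, for instance, create one ID per interest and listen on each of them.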

The bottom line? Feel free to use JXTA IDs created from known data. It is not against the spirit of P2P. The well-known ID is also an elegant way to solve many P2P problems. No one will call the P2P thought police. If they do, send them to me and I'll give them your Get out of Jail Free card.


Comments

  • Big Brain Dump Response to your Article
    Since you are the one who created the Binary ID work for JXTA, it is cool to see you lay out your full vision for that code.

    I also didn't know that the JXTA team is focusing on optimizing search for pipe endpoints, rather than optimizing on searching for advertisements, for example. This is important to know. For example, when designing a JXTA based P2P app, such as a content management system, I would ask myself where I store and search for the files for a particular content site. I might store the actual content inside of JXTA advertisements, and then use JXTA's Rendezvous or Resolver systems to search for those bits of content. However, if the JXTA team is optimizing pipes, then it makes more sense to advertise the existence of that content in an advertisement along with a pipe endpoint to find the content, but not put the actual content into the ad. Instead, I find the advertisement for where to find the content, then open a pipe to the peer that actually has the content.

    This is interesting, because it makes JXTA more about service discovery rather than using JXTA as a place to actually implement your services. JXTA becomes more of a directory service, where I search for what I need and find pipes to actually request the service, rather than a place to actually implement the service, such as putting instant messages between two peers into advertisements that are found using the Resolver service.

    Does this hold in general for most JXTA applications? One of my interests is in vastly simplifying JXTA. If this is true, then we can use the JXTA framework mostly as a distributed directory service + firewall NAT traversal system, which is what P2P Sockets does (http://p2psockets.jxta.org). In P2P Sockets, once you resolve a well-defined endpoint (using your Binary ID work), you get back a pipe to communicate with that endpoint (made to look like a standard socket). Publishing a service is exactly the same, but looks like a standard server socket. The main issue at this time is that these "friendly" service endpoints, such as the email address you give in your article or "www.nike.laborpolicy" are easily spoofed. This might possibly be an unsolvable problem if we want the distributed directory service to be secure, decentralized, and easy to use for end-users.

    I also read your blog entry comparing JXTA to the Apple II. I think JXTA is more like the Commodore 64 or Tandy Color Computer; it shows the way to an exciting new way of building applications/using computers, but it is missing something that can take it mainstream like the Apple II or IBM PC did. I think part of the problem is that JXTA is simply too complex for developers to use; it is also too complex for other developers to reimplement in other computer languages, which is keeping it from becoming a standard. Both RSS and XML were relatively easy for developers to reimplement in other languages and systems, which aided their growth. You could build a system or incorporate into an existing system XML or RSS in a weekend or in several weeks. Compare that to JXTA; the JXTA-C team is still struggling to just get the TCP endpoints working, never mind having C-based rendezvous or relay peers. JXTA is more right than any other P2P framework so far, but it isn't right enough to go mainstream yet. I don't know what the answer is; P2P Sockets is simply one exploration of a possible solution, but it still falls short.

    You also discuss ad-hoc discovery in your article. The funny thing in JXTA is that everything, even a Binary ID, is really a runtime "late binding" search for an ID. Everything in JXTA is really ad-hoc. That is how it can be dynamic and recover from failures or changes, because every time you "bind" to a resource you are really sending out a search that might return a different way of connecting to that resource.

    I actually think well-known IDs are one of the pieces we need for vastly more usable P2P frameworks. It is much easier to write a P2P app that deals with human-friendly strings (and much easier for end-users) than with 128-bit GUIDs.

    You also talk about Jini-like services for JXTA. If you look closely at P2P Sockets, that is kind of what we are trying to do, but using a service-description language that already exists: the HTTP based web. In the REST philosophy, or for web sites in general, we can "treat" any resource using a limited set of verbs: Get, Put, Post, etc. In essence, we can treat any arbitrary resource the same way Unix does, as a file descriptor. Even if that thing on the other side is actually an object oriented database, a camera running an embedded web server, or an administration console, we can Get from it, Post to it, etc. We have a baseline abstract way of dealing with it.

    Why shouldn't we be able to do the same thing with P2P networks? Why should we have to delve into JXTA's overly complex ModuleClassAdvertisement, ModuleSpecAdvertisement, etc.? Why should we have to deal with Jini's mobile code? Why can't we simply have a collection of easy to contact endpoints, with domain names or easy names just like web resources, which can be contacted and spoken to with HTTP? Under the covers that HTTP may actually be going over JXTA pipes and traversing firewalls, but at the end of the day it looks like a stream. I get the resource using the well-known name, then start talking to the resource. I first talk to it using HTTP verbs; I could be Getting a file from that peer, getting a web page, getting the results of some computation running on an embedded device, whatever. If I need to get more complex, I can start using XML-RPC or running things that look more like servlets that I Post to.

    This comes to the other side: where I implement my service. Why not use a technology that already exists for implementing services, called Servlets? A servlet simply takes a request, handles it, then gives a response; it doesn't care whether that request comes in from a peer-to-peer mesh or from a standard TCP HTTP connection. P2P Sockets includes a servlet engine that has been tricked into receiving all of its requests and sending its responses to peers on a peer network. You publish yourself to an endpoint, such as "www.nike.laborpolicy", and build a servlet that receives a request, such as "GET HTTP/1.0 www.nike.laborpolicy/somefile.txt", does something with it, then sends it back to the original peer. If several peers are servicing that name and can receive requests for "www.nike.laborpolicy", then you've now created something like a JXTA Peer Group Service, without having to read a book about JXTA (you just have to know how to work with standard Jetty and servlets).

    So the web itself becomes our abstraction layer that programmers "write" to, underneath which hides a P2P network backed by things like your binary ID work, Mohammed's JxtaServerSocket, and JXTA.

    The last thing we need is to actually return search to the mix. Just as a directory service can support retrieving something by its well-known ID, such as "someone@someemail.com", it can also support finding something by attribute, as LDAP does. So finding something indirectly by "attribute"/metadata or content is also needed, which JXTA doesn't do as well. We can also hide this and make it look like the existing web.

    One way to do this is to reuse a concept that programmers already understand: search engines. Why can't we define well-known endpoints that look like domain names, such as "www.jazzmusic.search", and which we can POST a search request to and receive back an XML or HTML list of "endpoints" (i.e. other P2P Sockets domain names) that have the requested search values? Under the covers, the P2P Sockets framework would know that any endpoint that ended with the word ".search" would actually mean to use JXTA's Rendezvous or Resolver functionality to search for advertisements that had the metadata you want in the Post request.
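
    The suffix rule described above could be sketched roughly as follows. This is not P2P Sockets code (the search portion is described in this comment as not yet written); EndpointRouter and its Route values are hypothetical names used only to show the dispatch decision.

```java
import java.util.Locale;

/** Hypothetical dispatcher: picks a handling path from the endpoint name. */
final class EndpointRouter {

    /** SEARCH means "run a metadata search"; DIRECT means "open a pipe". */
    enum Route { SEARCH, DIRECT }

    private EndpointRouter() {}

    static Route routeFor(String host) {
        String normalized = host.trim().toLowerCase(Locale.ROOT);
        // Names ending in ".search" are treated as search endpoints.
        return normalized.endsWith(".search") ? Route.SEARCH : Route.DIRECT;
    }

    public static void main(String[] args) {
        System.out.println(routeFor("www.jazzmusic.search"));
        System.out.println(routeFor("www.nike.laborpolicy"));
    }
}
```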

    For example, a future version of Paper Airplane (http://www.paperairplane.us), which is built on P2P Sockets, will have something called the Paper Airplane Directory. This is a simple, Yahoo-style hierarchical directory of available Paper Airplane Groups, which are basically just web sites with web-style domain names, such as "www.boobah.cat". This directory will look just like a search engine that you can Post metadata to and search for metadata on. To indicate that you have a Paper Airplane Group that you just created for the category "Politics" and the sub-category "Campaign Finance Reform", you might send a standard HTTP Post request to "www.paperairplane.directory/politics/campaign_finance_reform?site=www.campaign.reform". If you want to see what sites are in this category, you would send an HTTP Get request to "www.paperairplane.directory/politics/campaign_finance_reform".

    Here's the cool part. Every peer in the Paper Airplane peer group is running the modified servlet engine and can receive requests for "www.paperairplane.directory" (i.e. it is a JXTA peer group level service). When it receives a Get or Post request, it has to actually talk below the P2P Sockets framework to JXTA and do a search request to JXTA Rendezvous peers to find advertisements for Paper Airplane Groups that have Politics and Campaign Finance Reform tags. Once it gets enough of them, it can use a standard JSP or servlet to process them together and spit out HTML or XML back to the "peer" that contacted this service. Since the original peer "contacted" this "website" through a browser that was configured to use the P2P Sockets P2P to Web Proxy, a local proxy that tricks browsers into thinking they are talking to normal web sites when in fact they are talking on the JXTA network, the original peer simply receives back their HTTP response as a nice web page they can display to the user, which then looks just like a standard Yahoo style page.

    So it breaks down to this. HTTP becomes our universal way to contact services and provides a low-level way to "talk" to any service; things that look like domain names become our standard way to actually contact these services; and HTML/Mozilla XUL becomes our universal, easy-to-use UI language. You can now build P2P applications using the browser on the client side (though with P2P Sockets locally installed to intercept requests) using all your client-side knowledge, and use servlets on the "server"/service side to handle requests. Underneath it all we use JXTA for the P2P primitives we need.

    Basically what you have is something that looks like Universal Plug and Play (HTTP + HTML + SOAP), which is a competitor to Jini, possibly without having to learn very many new things.

    What do you think? The whole search engine portion is not coded; I feel like it could be much more simplified and made more universal.

    Posted by: bradneuberg on March 03, 2004 at 12:04 AM

  • Big Brain Dump Response to your Article
    Ooops, accidentally posted some sections into the text form twice.

    Posted by: bradneuberg on March 03, 2004 at 12:09 AM

  • Response to your blog entry
    Daniel, I messed up pasting my response above into the text form (I wish there was a preview button!). I posted the correct response to my blog at http://www.codinginparadise.org.

    Posted by: bradneuberg on March 03, 2004 at 12:13 AM




