Kohsuke Kawaguchi

Kohsuke Kawaguchi's Blog

Rant: I found Subversion immature

Posted by kohsuke on July 26, 2005 at 11:04 PM | Comments (10)

I just had a frustrating hour or so with Subversion. No, it's not that I have problems with its functionality (well, I actually do, but today isn't the time to talk about that). It's the lack of craftsmanship that bothers me.

Firstly, the proxy support. One of the big benefits of Subversion is that it can use HTTP to talk to the server. So one would hope that setting up the network connection would be easier with Subversion than with CVS, right?

Well, not at all, I found.

First, as a Windows program, it should be using the platform proxy setting (the one you set through IE's connection settings dialog). It's a fairly sophisticated mechanism that covers a wide range of use cases. It's also reasonably easy to use from programs: you just need to use the WinInet library instead of the socket library. Or, as a Unix program, it could read the $http_proxy variable, which seems to be the de facto standard for setting up a proxy. Instead, Subversion decided to invent its own way of setting proxy information. This makes it really painful to switch from one network configuration to another.
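To give an idea of how little code the $http_proxy convention needs, here is a minimal sketch in Python. It is not how Subversion does anything; the function name proxy_from_env, the fallback to the upper-case HTTP_PROXY spelling, and the default port of 80 when none is given are all assumptions made for illustration.

```python
import os
from urllib.parse import urlsplit


def proxy_from_env(environ=os.environ):
    """Return (host, port) from http_proxy / HTTP_PROXY, or None if unset."""
    value = environ.get("http_proxy") or environ.get("HTTP_PROXY")
    if not value:
        return None
    # Accept both "http://host:port" and the bare "host:port" form.
    if "://" not in value:
        value = "http://" + value
    parts = urlsplit(value)
    if parts.hostname is None:
        return None
    # Assumption of this sketch: fall back to port 80 when no port is given.
    return parts.hostname, parts.port or 80


if __name__ == "__main__":
    print(proxy_from_env())
```

Passing a plain dict instead of os.environ, such as proxy_from_env({"http_proxy": "http://proxy.example.com:8080"}), makes it easy to try without touching the real environment.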

Second problem. Network connection problems are one of the most common problems, because there are just so many things that can go wrong. So a program should be able to help users diagnose the problem. With CVS, you can use the -t option to trace the network access of the CVS program. You can see which host/port it's connecting to (if it's pserver), or you can see how CVS spawns the connect program (if it's ext.)

Subversion doesn't have any such option (in fact it doesn't seem to have any global options at all, so I might be missing something). When there are so many places where you can set the network configuration (~/.subversion, registry, ...), this is just poor craftsmanship. If Subversion had a trace option that made it print where it's loading proxy information from, how it's connecting, and what response it's getting, it would save a lot of time for many users.
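The kind of trace output I have in mind could look roughly like the following Python sketch. It assumes INI-style files with a [global] section and http-proxy-* keys, as in the servers file under ~/.subversion. The function name trace_proxy_config is made up, the registry is left out, and the rule that later files override earlier ones is an assumption of the sketch, not a statement about Subversion's real lookup order.

```python
import configparser
import os


def trace_proxy_config(paths, log=print):
    """Report, file by file, which proxy settings are picked up and from where."""
    found = {}
    for path in paths:
        if not os.path.isfile(path):
            log(f"config: {path} (not found, skipped)")
            continue
        # Disable interpolation so a '%' in a value is read literally.
        parser = configparser.ConfigParser(interpolation=None)
        parser.read(path)
        log(f"config: {path} (loaded)")
        if not parser.has_section("global"):
            continue
        for key, value in parser.items("global"):
            if not key.startswith("http-proxy-"):
                continue
            # Never echo a password into the trace output.
            shown = "****" if key.endswith("password") else value
            log(f"  {key} = {shown}")
            # Assumption: a later file overrides an earlier one.
            found[key] = (value, path)
    return found


if __name__ == "__main__":
    trace_proxy_config([os.path.expanduser("~/.subversion/servers")])
```

Even this much, one line per file tried plus the settings it contributed, would answer the question "which configuration is actually in effect?" without reaching for a packet sniffer.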

Third problem. In theory, HTTP-based connection support should have improved connectivity. But in practice, because Subversion decided to use the WebDAV protocol, it uses many HTTP methods (like PROPFIND) that are often not allowed by proxy servers. A simple Google search reveals how pervasive this problem is. While I'm sure the use of WebDAV makes some technical sense, it would have been a lot easier for us users if it just used the standard GET or POST methods coupled with a Subversion-Action header or something (I guess the SOAP-Action header exists for a reason!).
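To make the idea concrete, here is a small Python sketch of wrapping a WebDAV-style method inside an ordinary POST. The Subversion-Action header is purely hypothetical (it is my suggestion above, not something Subversion supports), and the function names and the example host and path are invented for illustration.

```python
def tunnel_request(method, path, host, body=b""):
    """Wrap a WebDAV-style method in a plain POST request."""
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Subversion-Action: {method}\r\n"  # hypothetical header, not a real one
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


def unwrap_action(raw):
    """Server side: recover the intended method from the header, or None."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("ascii")
    # Skip the request line, then scan the header lines.
    for line in head.split("\r\n")[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "subversion-action":
            return value.strip()
    return None


if __name__ == "__main__":
    raw = tunnel_request("PROPFIND", "/repos/trunk", "svn.example.com")
    print(raw.decode("ascii"))
    print(unwrap_action(raw))
```

A proxy that only allows GET and POST sees nothing but a POST here; the real operation travels in a header that it has no reason to inspect.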

Fourth problem. Of all the modern programming languages they could have chosen to implement Subversion in, they chose C. I mean C, the least productive programming language of all, second only to assembly language. Sure, it's necessary sometimes, like when you are writing a kernel or really high-performance computing software. But Subversion is neither.

Users would have been served much better if the Subversion developers' time had been spent on improving the tool itself instead of fixing string manipulation bugs, tracking down core dumps, and so on. I really don't understand why they didn't pick Python, Ruby, or even Java. Any of those would make Subversion run on more platforms and improve the productivity of its developers.

You see, none of those are critical to the architecture of Subversion or anything. It's just the rough edges that you need to smooth out. It's really nothing but a lack of craftsmanship to be unable to remove this many issues after so many years of development (I hit all those problems within an hour.)

Well, now that I said all I wanted to say, at least I feel much better!


Comments
Comments are listed in date ascending order (oldest first)

  • For your first point, why not suggest that as an RFE or discuss it on the developers' list? It's an open project, and they're also quite open-minded. From the technical viewpoint, they use the neon webdav library, maybe that has something to do with it. They're currently redesigning configuration "autoprops", it seems like a good time to contribute.

    For the second point, I prefer that Subversion concentrates on being a good version control program, instead of a good network monitor. You can easily install more sophisticated network monitoring software if you want.

    I think the WebDAV is a VERY good idea (note that you can use "file" or "svn+ssh" protocols if you prefer). In general, it doesn't require opening ports other than 80 (unlike normal CVS), so as you need to adapt your firewalls and proxies anyway for CVS, I don't see this being any more or less problematic (maybe I'm just lucky with network configurations). Concerning why I think WebDAV is a very good idea, it's because it lets web designers, content authors, and so on, mount WebDAV folders on Windows/Mac OS/probably other systems and edit version-controlled files as if they were on their local filesystems. Dreamweaver users can for example just publish files that way, they don't need to jump through hoops as with CVS.

    As for the choice of C, they presented their arguments on their website at one time. The choice of C doesn't require any runtime other than the bundled Apache APR runtime, making it as cross-platform as Apache 2. The world doesn't just use Java, so I think it was a valid choice. As they said though, any choice is almost always subjective at some point and they knew people would ask "why didn't you develop it in (your preferred language here)?". You can use Java SVN if you want, it was developed as a pure Java layer (used in IntelliJ IDEA 5 for built-in Subversion support).

    I agree, it would be nice to improve configuration settings, but they're working on it, and even if automatic detection of proxy settings would be nice, you can get by at present. In the meantime, there is no file corruption by default on EOLs, no problems with Unicode, no hassle with keyword substitution, no corrupted binaries, and a much cleaner alternative to branches (cheap copies), and we can benefit from an upgrade path using CVS2SVN (and it's not written in C...) to get atomic commits, reduced network round trips, and decent file/directory moving and renaming.

    Posted by: chris_e_brown on July 27, 2005 at 01:09 AM

  • My main gripe with Subversion is that it still relies on the user documenting merges. The fact that you have merged changes from some other branch should be maintained by the source code control system itself.

    Posted by: mthornton on July 27, 2005 at 02:59 AM

  • The roadmap suggests that merge tracking is in line for big improvements.

    I'm likely to move our company's main repositories from CVS to Subversion soon, after some successful pilot projects. It's still a young product, but seems very reliable with an intelligent roadmap and an active community moving it forward, so these things should get ironed out. In the mean time, it's solved a lot of CVS' limitations... Developer tools seem to be offering very good integration now, so it's a valid option. For us, at least! :-)

    Posted by: chris_e_brown on July 27, 2005 at 04:40 AM

  • Actually, SVN has a number of Python scripts for post-commit hooks etc. I think the idea behind using C was a "least-common denominator" approach. SVN can be ported to platforms where Java/Python etc just aren't available, and it relies on a well-tested, very solid, technology.

    With regard to proxy support, I can't say much about it: I haven't worked behind a proxy in years. However, I wouldn't suggest reusing plain HTTP for every possible purpose -- I prefer a "best protocol for the job" approach, and inevitably HTTP is best for simple request-response interactions. Frankly, proxy vendors, firewall suppliers and sysadmins need to get smarter about port blocking. Computers have 2^16 ports; it seems a little daft to restrict everything to port 80!

    I wouldn't criticise the svn craftsmanship too much. I've been running a small personal repository for 3 years now without much of a hitch. I flatly refuse to use CVS for anything now.

    Posted by: hopeless on July 27, 2005 at 07:40 AM

  • About proxy settings in Windows:
    There really isn't a "platform proxy setting". The proxy settings that can be set up through IE's connection settings dialog are specific to the WinInet library. Any application that uses WinInet will share the proxy settings but there are many reasons not to use WinInet.
    WinInet was developed as part of IE. This has two consequences. First, WinInet is distributed and rev'ed as part of IE not the OS. Second, WinInet was not conceived of as a general purpose HTTP stack and it makes assumptions about being used as an HTTP client in a desktop windows application (like it assumes by default that it can display dialogs to the user).
    WinInet is inappropriate for use in a server type application and Microsoft specifically warns against using WinInet in a Win32 Service. (A Win32 Service is the Windows equivalent of a Unix Daemon.)
    WinHTTP is a newer and more general purpose HTTP library provided by Microsoft. WinHTTP is standard in Win 2003, Win XP SP1, and Win 2000 SP3.
    However WinHTTP doesn't use the WinInet proxy settings. Instead proxy settings for WinHTTP can be configured with the command line ProxyCfg.exe tool.
    There's also the "HTTP API" which is currently only available in Win 2003.
    Given the issues about which library is most appropriate or even available, it's not surprising a lot of cross-platform software opts to use an independent HTTP stack.
    Also note Java on Windows doesn't use WinInet for HTTP but if IE 5 or later is installed Java can be configured to fetch and use the WinInet proxy settings. (Prior to IE 5 WinInet didn't expose an API for getting the proxy settings.)

    Posted by: jdodds on July 27, 2005 at 07:58 AM


  • chris_e_brown, I think you are right that I should scratch my own itches. After all, Subversion is open-source!


    I noticed that their FAQ suggests using Ethereal for troubleshooting. Sure, that would work, but see, that's too big a gun for a problem like this. I agree that "seeing requests and responses" is a bit too much, but just showing where config info is loaded from would go a long way, IMO.


    If the svn server also supports the WebDAV protocol, while the svn clients use a different protocol (enveloped in HTTP), then I think we get the best of both worlds.

    Posted by: kohsuke on July 27, 2005 at 10:03 AM


  • hopeless, you are very lucky that you aren't behind a proxy. Unfortunately Sun is (but there's a good reason, I'm sure). I also agree that I shouldn't criticize the svn craftsmanship too much, but hey, this was a rant.


    jdodds, thank you very much for the informative explanation of WinInet and the other libraries. I learned a lot!

    Posted by: kohsuke on July 27, 2005 at 10:08 AM

  • Re: proxy problems. If the server supports HTTPS then you can probably use that to tunnel through the proxy without it blocking any unknown methods like PROPFIND. I had the same problem and that solved it well.

    Posted by: nicklothian on July 27, 2005 at 04:12 PM
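    The reason HTTPS helps is that the client first asks the proxy to open a raw tunnel with the standard CONNECT method, and everything after that (including PROPFIND) is encrypted, so the proxy cannot see or filter the method. A minimal Python sketch of that first request follows; the function name connect_request and the example host are assumptions made for illustration.

```python
def connect_request(host, port=443):
    """Build the request a client sends to a proxy to open a tunnel."""
    target = f"{host}:{port}"
    return (
        f"CONNECT {target} HTTP/1.1\r\n"
        f"Host: {target}\r\n"
        "\r\n"
    ).encode("ascii")


if __name__ == "__main__":
    # After a 2xx reply from the proxy, the client starts TLS over the same
    # connection; the proxy then only relays opaque bytes.
    print(connect_request("svn.example.com").decode("ascii"))
```

    This only works if the proxy permits CONNECT to the server's port, which is commonly the case for port 443.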

  • I've successfully used Subversion through the Sun firewall using https. Note that the version of svn that ships with OSX doesn't support https (because it uses the same http stack as Finder, which also doesn't support https webdav). You should download a new client directly from the Subversion guys. I also had http issues when I was attending JavaOne, which is what made me make the switch.

    - J

    Posted by: joshy on July 28, 2005 at 01:05 AM

  • This discussion is ODD.
    HTTP: The communication protocol should be abstracted out of the Source Control problem space and completely independent. The independence should allow the opportunity to switch out connection and connection types. ODD that this is even an issue.
    Open Source: Eclipse 1.0 is horrible compared to Eclipse 3.0. It takes time for products to mature, even open source products. ODD that this is not expected.
    Language: People can and will make an argument for any language. With Java's present state of maturity, there is NO reason to still be using C/C++ on a project of this type. The use of C/C++ demonstrates more of a preference than a reality of computer science. ODD that developers still defend the merits of C/C++.

    Posted by: malcolmdavis on July 28, 2005 at 12:26 PM




