Jan Haderka's Blog

Jan Haderka is an independent software developer and technology consultant focusing on desktop and enterprise applications. He has been writing software for a number of years, and since 1995 he has been doing so for a living, working for companies ranging from small startups to big corporations. Lately he has been involved in a number of projects, SwingLabs and Magnolia among others.



SwingX 1.0 released

Posted by rah003 on June 04, 2009 at 06:34 AM | Permalink | Comments (0)

For all of you who could not come to the SwingLabs session at JavaOne yesterday, here comes the announcement again:

After many broken promises and many delays ... SwingX 1.0 is released and available for download at the SwingLabs download page.

I would like to take this opportunity to thank everybody who contributed to the release, and for all the patches we have received. Namely: Jeanette, Luan, Jonathan ... thanks a lot. I'd especially like to thank Karl, who took over lots of issues lately and worked really hard to make sure everything was ready for JavaOne. It was a really nice feeling to finally be able to announce the release.

The usual technical stuff: you can read the Release Notes before downloading if you wish, and for those of you using Maven, this release should be in the central Maven repo in a couple of days.

Maybe a few more comments on this release and on what is likely to happen now (still open to discussion in the forum):

  • From now on, Java 5 compatibility will no longer be maintained.
  • To put it even more strongly, all the compromises in the code and all the dependencies that were necessary because of Java 5 will be actively removed so that we can move forward with code development.
  • That of course doesn't mean that you have to move to Java 6. If for any reason you are stuck with Java 5, you can use the 1.0 release. And if you are willing to step up to the challenge, maintain a Java 5 compatible branch of SwingX and backport bug fixes to it, just let me or Alex know and we will make such a branch available for you and ensure it is built and deployed by Hudson.
  • If there are things currently in the incubator that you would like to see in SwingX itself, let us know on the SwingLabs forum.
  • In the same way, if you have code or a component you would like to contribute to the project, submit a patch to the issue or put the component in the incubator and let us know on the SwingLabs forum so it can be discussed and agreed upon.

Enjoy!



How many backup solutions is too many?

Posted by rah003 on May 07, 2009 at 06:51 AM | Permalink | Comments (1)

Or can there ever be too many backups? I don't think so. On the other hand, I have very often seen people underestimate the testing of their backup solutions and restore procedures, and discover issues only in the middle of a crisis, when a restore of a previously made backup is desperately (and quickly) needed.

Just a few examples:

  • Backup is seemingly running, but the backup media is corrupted in a way that makes reading the backed-up data impossible. (No, I'm not making this one up, I've actually seen it happen. For nearly 6 months everybody was happy that a database backup was running, and they discovered the issue only when the backup could not be recovered from the tape.)
  • Backup is made of corrupted data. A typical scenario here is an automated backup that keeps a history of backups going a few days back and automatically replaces old data snapshots with new ones. If the issue is not in plain sight and is discovered too late, chances are that all the existing backups already contain corrupted data. (And yes, I saw this one happen too, and more than once.)
  • Backup is made periodically and runs without any issues, but no one has ever tried to do a restore. So there is no restore procedure in place for when the time comes to actually restore the data, and in the worst case some element is missing or not functional in the given configuration, which makes the restore impossible. (In one case where I saw this, the backed-up data didn't include binaries stored outside of the database, so all the metadata and relations between binaries were backed up, but not the actual content of the files.) And even if the backup itself is alright, this is a dangerous scenario. When you are restoring a backup, it usually means something went wrong, and everybody is, if not nervous, then at the very least tense and prone to oversights and mistakes. In such a case, a tested procedure with clearly outlined steps describing what to do, when and how is worth all the money.
  • Surely there are more scenarios you can think of. My general thinking about backup & restore procedures goes along the lines of "If something can go wrong, it will, and you had better be ready for it."
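
To make the "actually try the restore" point a bit more concrete, here is a minimal sketch of one possible check: restore the backup into a scratch directory and compare every file against the original by checksum. The class and method names (RestoreCheck, findProblems) are made up for this example, and it assumes both trees are plain directories on disk that are not being modified during the check. A real restore test would also have to cover databases, permissions and so on.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: compares a restored directory tree against the original one. */
final class RestoreCheck {

    /** Returns the SHA-256 digest of a file as a lowercase hex string. */
    static String sha256(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** Lists files from the original tree that are missing or different in the restored tree. */
    static List<String> findProblems(Path original, Path restored)
            throws IOException, NoSuchAlgorithmException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(original)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        List<String> problems = new ArrayList<>();
        for (Path file : files) {
            Path relative = original.relativize(file);
            Path copy = restored.resolve(relative);
            if (!Files.isRegularFile(copy)) {
                problems.add("missing: " + relative);   // e.g. binaries stored outside the database
            } else if (!sha256(file).equals(sha256(copy))) {
                problems.add("differs: " + relative);   // e.g. unreadable or corrupted media
            }
        }
        return problems;
    }
}
```

An empty list from findProblems only says that the restored files match the originals. It says nothing about whether the originals were already corrupted when the backup was made, which is the second scenario above.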

Why did I bring this topic up at all? Recently I was party to one such recovery. The public web site was running, but it was corrupted. One subtree of the site could not be updated any more due to data corruption. There were multiple public instances, but all of them suffered from the same problem. The good backup data was way too old (a couple of weeks is way too much for a site updated on a daily basis).

The only good thing was that the authoring instance was still intact and working. Piece of cake, you might think. Just set up a new, clean public instance, republish everything and you are done. Yeah, you might think that, but ... there were several "but"s in this case. To name just a few:

  • The existing configuration, while only partially updatable, had to be kept working, running and updated until the replacement was ready.
  • The ongoing editorial process and automated data publishing made it impossible to disconnect the existing public instance while the new public instance was being created. The content had to keep being pushed to the majority of the site that was still working.
  • The site was moderately big, a couple of tens of thousands of assets (pages, images, proprietary data, etc.), and republishing all that takes some time.
  • And, most importantly, a lot of pages had already been edited after their last publishing but had not yet finished another editorial & approval round, so they were not ready to be pushed to the new public instance.

What was needed in this case was some way to create a new public instance, as close to the existing ones as possible, while pulling the data from the author instance. Fortunately for all involved, Magnolia comes configured by default to version existing content upon publishing it, so we could use the last existing versions of all previously activated content to recreate the public instance. This, together with some magic to avoid changing the status of the content in the author instance as is done during the normal publishing process, gave birth to the synchronization module. Which, as you see, is just another way of restoring your public site in Magnolia, using the authoring instance as a living source of backup data. There are limitations to this approach. For example, if you switch off the versioning feature on the author instance, the module can synchronize only unmodified activated content. And if you deleted a piece of content, it can't be synchronized since it is completely gone (but this is fine, since deletions are always synched with the public instance immediately).
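
The selection rule described above can be sketched in a few lines. This is only an illustration of the rule, not the module's code: SyncSketch, Item and contentToSync are names invented for this example, content is reduced to a plain string, and the version history is modelled as a simple list of snapshots taken at publish time.

```java
import java.util.List;
import java.util.Optional;

/** Illustrative only: none of these names come from Magnolia. */
final class SyncSketch {

    /** Hypothetical view of one piece of content on the author instance. */
    static final class Item {
        final String path;
        final boolean activated;      // has it ever been published?
        final boolean modified;       // edited since the last publish?
        final String currentContent;  // what the author instance holds right now
        final List<String> versions;  // snapshots taken on publish, oldest first; empty if versioning is off

        Item(String path, boolean activated, boolean modified,
             String currentContent, List<String> versions) {
            this.path = path;
            this.activated = activated;
            this.modified = modified;
            this.currentContent = currentContent;
            this.versions = versions;
        }
    }

    /** Returns the content to push to the new public instance, or empty if the item must be skipped. */
    static Optional<String> contentToSync(Item item) {
        if (!item.activated) {
            return Optional.empty();  // never public, so there is nothing to recreate
        }
        if (!item.versions.isEmpty()) {
            // The last version is what was published most recently, even if editing continued afterwards.
            return Optional.of(item.versions.get(item.versions.size() - 1));
        }
        // No versions available: only unmodified content still equals what was published.
        return item.modified ? Optional.empty() : Optional.of(item.currentContent);
    }
}
```

Deleted content does not show up here at all, simply because there is no item left on the author instance to look at.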

Anyway, the module was written, tested and successfully used to create new public instances that were already in sync with the existing public instances and could be used as an instant replacement. Once all was done, we asked ourselves a question: was it just a one-time job, or can we turn it into something more beneficial for everybody? Are there more use cases than the one we just used it for?

Here's my list of answers. If you are using Magnolia, I would be interested to hear whether you, as a customer of the CMS, see these as useful use cases for yourself, whether you think there are better ways to perform the tasks, or (even better) whether you see some more use cases that I missed.

  • The public instance usually sits in an open network environment, making it a potential target for the kind of people who try to break everything. In case someone hacks your public instance in any way, you can use the synchronization module to get back to a reasonable state without resorting to emergency measures like activating everything you have on your author instance, even if it was modified or is not yet ready.
  • A hardware or software failure causes you to lose an instance. It might happen that for some reason your backup is not recoverable; this gives you another safety net.
  • Due to a sudden increase in popularity (e.g. the Slashdot effect) your public server is under heavy load and you want to add another public instance. Replicating a public instance from one that is already under load is out of the question, as is republishing everything to the existing public instance together with the new one.
  • A variation of the above: your public instances are already moderately busy with the current number of visitors, you are preparing the launch of a new product, and you expect a spike in load and want to be ready for it. Again, it is easier to push the content from author than from public.
  • You want to create simple replicas of your public instance to test variations of a new design/theme for your site. Again, using the current public instance for that is inadvisable if it is under moderate load, and in general too, as any mistake there might render your current website inaccessible. It is much safer to do this from the author instance.
  • For technical reasons (a hardware upgrade) one of your public instances was out of business for a while. In order to keep everything running, you disabled or removed this instance from the list of subscribers, but now, to get it back into your pool of public instances, you need to re-synchronize it to ensure all the content published to the other public instances in the meantime gets to this one as well. With XA you only ensure that content is on all or none of the active subscribers; it does not re-synchronize newly added (or re-enabled) subscribers with all the other instances.
  • And of course, the usual ones:
    • "because s**t happens" (TM)
    • "because we can" (TM)


Magnolia Cache in clustered environment

Posted by rah003 on April 29, 2009 at 12:21 PM | Permalink | Comments (0)

I might have mentioned something about the cache in Magnolia here before. Today, let's look at another aspect of it.

While in general Magnolia follows the well-known and well-understood publish/subscribe model when it comes to page activation (activation is always done in the direction from the authoring instance to the public instance), there is one notable exception to this model: public-generated content. This is content such as forums, page comments, etc. It is created on the public instance and resides on the public instance only. No big deal, you could say. Such public-generated content lives in another workspace, completely separated from the web content and only loosely connected to it in the case of features like page comments.

Yeah right. Let's add cache to the picture.

The cache on the public instance automatically flushes itself when a new piece of content is published, to make sure users are not served a stale version of the page. Still no issue here. Also, when a new page comment is created, the commenting module instructs the cache to flush the page the comment was made on, since it knows where the comment is coming from.

Still looks kind of OK. Let's add multiple public instances to the picture.

Yuck! If I have a forum or page comments deployed on multiple public instances and they are not aware of each other and of each other's content, we've got an issue. The solution is quite simple: let's just connect our public instances into a cluster. Is that possible? Yes, why not ... as long as we use a JCR implementation that is clusterable, everything should be fine. There are various reasons why one might not want to cluster all the workspaces, and instead use multiple repositories and cluster, for example, only the forum/commenting workspace in a separate repository. You can do both in Magnolia, and it is not the point of this exercise, so I will not say more on this subject. What I want to explore today is how this affects the cache ... which is not clustered in either case.

For plain activation of content, full or partial clustering is not an issue. When fully clustered, only one node of the cluster is subscribed to the author instance, and author will publish content only to this one public instance. When content gets published, all cluster nodes are notified about the new content via events distributed to all nodes of the cluster, and all nodes flush their caches as they should. When publishing content to a workspace that is not clustered, all of the cluster nodes need to be subscribed, so the author instance publishes content to each of them, and again all of the public instances get their own event notifications and flush the cache upon receiving the appropriate event.

Now about public-generated content, and more specifically about page comments. We definitely want to cluster the workspace in which they are stored so we can share them across all cluster nodes. If it is not clear from the above, page comments are not part of the page itself, but are stored separately and only reference the page to which they belong. So when a new comment is added, there is nothing to flush from the cache based on event notification, since the comment is not a page that anyone can see directly. Hence the original solution of telling the cache directly "hey, go and flush page xyz since there is a new comment on it, even though you don't know about it". Unfortunately this approach breaks as soon as clustering comes into play. The commenting module can discover only the cache that is local to it (running in the same cluster node), so at most it can flush the given page from the cache at the given node, but not from the others. Now you see where the problem lies. We need to notify all the nodes across the cluster and flush the affected page from each of them.

What we could do, as an extreme solution, would be to turn caching off for all cluster nodes. After all, we have a cluster and can add extra nodes if we need to scale up, so why bother with caching? Somehow I don't think that is a good idea, but yeah, it would be a solution to the problem.

Another option would be to cluster the cache as well. The default implementation of the Magnolia cache uses ehCache underneath, so this is quite possible, but it would complicate the configuration of the instances. I was looking for a painless solution to the issue.

The solution I chose in the end was to use the existing event notification mechanism, which works well in a clustered environment and is already used for flushing pages from the cache on activation. So now there is a flush policy that listens for updates in the forum/commenting workspace, and when it detects one, it figures out whether there is a related page and, if so, flushes it from the cache. No big deal. As usual in such cases, figuring out the right solution took longer than actually implementing it.
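
Here is a small, self-contained simulation of that idea, just to show the shape of it. It does not use the Magnolia, JCR or ehCache APIs: CommentWorkspace stands in for the clustered workspace and its event delivery, PublicNode for a public instance with its own local cache, and all names are invented for this sketch.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Simulation only: these types are assumptions, not Magnolia or JCR classes. */
final class ClusterFlushSketch {

    /** Called on every cluster node whenever a comment is stored in the shared workspace. */
    interface CommentListener {
        void commentStored(String commentId);
    }

    /** Stand-in for the clustered comments workspace and its event distribution. */
    static final class CommentWorkspace {
        private final Map<String, String> pageByComment = new ConcurrentHashMap<>();
        private final List<CommentListener> listeners = new CopyOnWriteArrayList<>();

        void register(CommentListener listener) {
            listeners.add(listener);
        }

        /** Stores a comment that references a page, then notifies the listeners on all nodes. */
        void store(String commentId, String pagePath) {
            pageByComment.put(commentId, pagePath);
            for (CommentListener listener : listeners) {
                listener.commentStored(commentId);
            }
        }

        /** Returns the page a comment belongs to, or null if the comment is unknown. */
        String pageFor(String commentId) {
            return pageByComment.get(commentId);
        }
    }

    /** One public instance: a local, non-clustered page cache plus the flush policy. */
    static final class PublicNode implements CommentListener {
        final Map<String, String> pageCache = new ConcurrentHashMap<>();
        private final CommentWorkspace workspace;

        PublicNode(CommentWorkspace workspace) {
            this.workspace = workspace;
        }

        @Override
        public void commentStored(String commentId) {
            // Flush policy: work out which page the comment relates to and drop it from the local cache.
            String pagePath = workspace.pageFor(commentId);
            if (pagePath != null) {
                pageCache.remove(pagePath);
            }
        }
    }
}
```

In this sketch the code that stores the comment never touches any cache. Each node reacts to the event on its own and flushes only its own local copy of the page, so adding another node is just one more register call.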

Below is a diagram showing the interaction of page comments and the cache in a clustered Magnolia environment. As you can see, when content is stored in the workspace, an event notification is distributed to listeners on all cluster nodes. The appropriate flush policy, which understands what is being stored in the forum workspace (as we use the forum workspace to store page comments) and how it affects pages, ensures that the cache is updated as appropriate. And we have no need for direct commenting-cache interaction.

[Diagram: cache-commenting.png]


SwingX 0.9.7 Released

Posted by rah003 on April 05, 2009 at 12:11 PM | Permalink | Comments (0)

While writing the last entry I certainly didn't expect to be making another release in a week's time. Nevertheless, here we go. Due to a serious regression in the calendar component, the team has decided to push another release quickly, which means SwingX 0.9.7 is out.

The details of all issues fixed in this release are in the Release Notes. Binaries are available at the SwingLabs website. The release should be available in the java.net Maven repo over the next couple of days.


Enjoy the release.



SwingX 0.9.6 Released

Posted by rah003 on March 29, 2009 at 04:49 PM | Permalink | Comments (3)

While it was not originally planned, after a discussion in the forum the team has decided to push one more release before going for 1.0 and to take this opportunity for just one more sweep and cleanup of the API. There were a few other notable changes, such as the cleanup of deprecated code. Read more on the wiki and in the release notes.

As always you can download the binaries from SwingLabs website.

Enjoy.





