
Cached! ... again

Posted by rah003 on March 10, 2010 at 11:52 PM PST

I have written about the Magnolia cache a few times already since it was re-implemented for Magnolia 3.6. It seems that with Sprint 4 of Magnolia 4.3 it came back to bite me.

There were a bunch of tickets related to various aspects of the cache. Most of them came down to the fact that the default cache key (the URI only) was not enough for many installations that used multiple domains (and customized output based on the domain) or multiple languages, changing content based on the user locale (set in the HTTP header). On its own this was not a problem, since you could just re-implement the cache policy, executors and flush policy, but because the issue was so widespread we decided the Magnolia Cache Module should support such use cases by default.

The changes necessary to fix the issues described above actually fit nicely with the goals of the Magnolia 4.3 release, which was all about making multi-domain and multilingual configurations easier to use and more user friendly. On top of that, I don't think many people ventured into extending the cache and writing their own, more specific keys, policies and executors. Most of them just switched caching off altogether and took the performance hit. That was fine as long as they could manage, but now there is a better way.

With the finalization and refinement of multi-domain support and with the extensions to i18n, the structure of the key was just not enough anymore. The key needs to be aware of the domain at all times, as well as of the user locale, no matter whether it is encoded in the URI or not.

The cache key was also ignorant of request parameters. Arguably that was not a problem, because caching of dynamic content is not desired in most cases and was switched off by default anyway. While this configuration stays, the key is now aware of the parameters and their values, so such requests can now be cached (of course only as long as the output produced by such queries is intended to always be the same, but the choice is now up to the user).
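To make the idea of a richer key more concrete, here is a minimal sketch of a composite key that takes the domain, URI, locale and request parameters into account. The class name `CompositeCacheKey` and its fields are hypothetical and chosen for illustration only; this is not the actual Magnolia class, and it simplifies parameters to single values.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

// Hypothetical composite key, for illustration only; not the real Magnolia key class.
final class CompositeCacheKey {
    private final String domain;
    private final String uri;
    private final Locale locale;
    // Copied into a sorted map so the order in which parameters arrive does not matter.
    // Assumption: one value per parameter (real requests may carry several).
    private final Map<String, String> params;

    CompositeCacheKey(String domain, String uri, Locale locale, Map<String, String> params) {
        this.domain = domain;
        this.uri = uri;
        this.locale = locale;
        this.params = new TreeMap<>(params);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof CompositeCacheKey)) {
            return false;
        }
        CompositeCacheKey other = (CompositeCacheKey) o;
        return Objects.equals(domain, other.domain)
                && Objects.equals(uri, other.uri)
                && Objects.equals(locale, other.locale)
                && params.equals(other.params);
    }

    @Override
    public int hashCode() {
        return Objects.hash(domain, uri, locale, params);
    }
}
```

With a key like this, the same URI requested on two domains, in two locales, or with different parameter values ends up in separate cache entries, while two requests that differ only in parameter order share one entry.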

Another weak point of the previous cache key was its binding to user-generated content (UGC), such as page comments. Since page comments are rendered as part of the page and not retrieved via AJAX calls, UGC-related features need to be able to flush specific pages from the cache (those that have been commented upon). Up until now, the UUID was used to get the page handle, and the page handle was assumed to match the URI 1:1. This works for a simple case such as commenting, but not in general, so a stronger cache key and a mapping between key and UUID, recorded when the entry is created, are necessary to make the feature work universally and irrespective of the various content mappings.

Improvements to the cache key are visible in this area too. Magnolia now persists the binding between the cache key and the UUID of the main content used to generate the cache entry. This means you can now explicitly request that all content created from a given resource be flushed from the cache, and the cache will clean up all entries created from that resource, no matter what domain, locale or parameters were used in the request when the keys and entries were generated.
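The binding between UUIDs and cache keys can be pictured roughly as follows. This is only an in-memory sketch under my own assumptions: the class `UuidIndexedCache` and its methods are hypothetical, keys are plain strings for brevity, and Magnolia itself persists the binding rather than keeping it in a map.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory sketch; Magnolia persists this binding and its real classes differ.
final class UuidIndexedCache {
    // cache key -> cached output
    private final Map<String, byte[]> entries = new HashMap<>();
    // content UUID -> all cache keys generated from that content
    private final Map<String, Set<String>> keysByUuid = new HashMap<>();

    // Store an entry; the UUID may be null for entries not tied to any content.
    synchronized void put(String key, String uuid, byte[] output) {
        entries.put(key, output);
        if (uuid != null) {
            keysByUuid.computeIfAbsent(uuid, u -> new HashSet<>()).add(key);
        }
    }

    synchronized byte[] get(String key) {
        return entries.get(key);
    }

    // Remove every entry generated from the given content; returns how many were removed.
    synchronized int flushByUuid(String uuid) {
        Set<String> keys = keysByUuid.remove(uuid);
        if (keys == null) {
            return 0;
        }
        int removed = 0;
        for (String key : keys) {
            if (entries.remove(key) != null) {
                removed++;
            }
        }
        return removed;
    }

    synchronized int size() {
        return entries.size();
    }
}
```

Flushing by UUID removes the page's entries for every domain, locale and parameter combination in one call, without the caller having to reconstruct any of the keys.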

While testing the cache and its various configurations and functions, I also ran into the need to observe and tweak what was happening, so the Cache Module in Magnolia 4.3 comes with its own MBean and you can observe cache behavior via a JMX console. And not only observe: you can also flush the whole cache or specific entries using this managed bean. (Just to put this in perspective, there is of course also the MBean from the underlying cache engine itself, ehCache, but that operates at a somewhat different level and is not aware of bypasses and the various cache behaviors.)

As if all that was not enough, while writing the flushing functions for the managed bean, I thought I had better extract them into separate commands, and did so. The extraction happened for two reasons: first, it is no business of the managed bean to know the internals of calling various cache functions when they can be hidden elsewhere, and second, such functionality might be useful elsewhere and there is no better way to expose it in a reusable fashion than by putting it into a command. So Magnolia 4.3 also comes with two new commands for flushing either the whole cache or specific keys, based on content UUIDs. This way, if someone wants to cache, say, the output of dynamic queries and flush that content from the cache every 30 minutes or so, they can just configure a scheduled job to invoke the command periodically without needing to write a single line of code.
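In Magnolia this is a matter of configuration, but the design idea behind it (one flush command reused by both the managed bean and a scheduler) can be sketched in plain Java. Everything below is a hypothetical stand-in: `FlushCommand`, `FlushAllCommand` and `FlushScheduler` are names I made up for the example, and the scheduling uses the JDK's `ScheduledExecutorService` rather than Magnolia's own scheduler.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a command; Magnolia's real command API is not shown here.
interface FlushCommand {
    void execute();
}

// Wraps the actual flush logic so callers do not need to know cache internals.
final class FlushAllCommand implements FlushCommand {
    private final Runnable flushAction;
    private final AtomicInteger executions = new AtomicInteger();

    FlushAllCommand(Runnable flushAction) {
        this.flushAction = flushAction;
    }

    @Override
    public void execute() {
        flushAction.run();
        executions.incrementAndGet();
    }

    int executions() {
        return executions.get();
    }
}

final class FlushScheduler {
    // Runs the command repeatedly at a fixed rate; the caller shuts the scheduler down.
    static ScheduledExecutorService schedule(FlushCommand command, long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(command::execute, period, period, unit);
        return scheduler;
    }
}
```

The same `FlushAllCommand` instance could be invoked directly (as a managed bean operation would) or handed to `FlushScheduler.schedule(command, 30, TimeUnit.MINUTES)` for the periodic case.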

 

Here are the various attributes you can see on the MBean:

  • Bypasses - Count of all the requests deemed as not cacheable by the current cache policy
  • Puts - Count of all the requests deemed as cacheable by the very same cache policy
  • Hits - Count of all the requests already found in the cache
  • All - Just a list of the above. Since it is possible to configure custom policies and custom behaviors, this value is here to show such extra behaviors (if any)
  • CachedKeysCount - Count of all the cache keys associated with some UUID (not all the cache entries need to have such association)
  • CachedUUIDsCount - Count of all unique UUIDs associated with some existing cache key. The difference between this number and the one above gives an indication of how many entries may originate from a single page
  • DomainAccesses - Number of requests coming from different domains
  • Flushes - Number of times the cache has been flushed since the last server restart
  • StartCalls - Number of times the cache module has been started since the last server restart
  • StopCalls - Number of times the cache module has been stopped since the last server restart
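To illustrate why an "All" attribute is useful next to the fixed Bypasses, Puts and Hits counters, here is a small sketch of counters keyed by behavior name. The class `CacheBehaviorStats` and the behavior labels used with it are hypothetical; this is not the real Magnolia MBean and it is not registered with JMX.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical counter holder; the real Magnolia MBean is not reproduced here.
final class CacheBehaviorStats {
    private final Map<String, AtomicInteger> counts = new ConcurrentHashMap<>();

    // Record one request handled with the given behavior.
    // The labels ("bypass", "put", "hit", or any custom one) are assumptions for this sketch.
    void record(String behavior) {
        counts.computeIfAbsent(behavior, b -> new AtomicInteger()).incrementAndGet();
    }

    // Count for a single behavior; 0 if that behavior has never been recorded.
    int count(String behavior) {
        AtomicInteger counter = counts.get(behavior);
        return counter == null ? 0 : counter.get();
    }

    // Snapshot of every behavior seen so far, including custom ones.
    Map<String, Integer> all() {
        Map<String, Integer> snapshot = new TreeMap<>();
        counts.forEach((behavior, counter) -> snapshot.put(behavior, counter.get()));
        return snapshot;
    }
}
```

Because the counters are keyed by name, a behavior introduced by a custom policy shows up in `all()` without any change to the statistics class, which is the role the "All" attribute plays above.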
Attachment: jmx-mgnl-cache.png (49.3 KB)