Thursday, June 16, 2011

Ten Caching Mistakes that Break your App

Introduction
Caching frequently used objects that are expensive to fetch from the source makes an application perform faster under high load and helps it scale under concurrent requests. But some hard-to-notice mistakes can make the application suffer under high load instead of performing better, especially when you are using distributed caching, where a separate cache server or cache application stores the items. Moreover, code that works fine with an in-memory cache can fail when the cache is moved out of process. Here I will show you some common distributed caching mistakes; knowing them will help you make better decisions about when to cache and when not to cache.

Here are the top 10 mistakes I have seen:
  1. Relying on .NET’s default serializer
  2. Storing large objects in a single cache item
  3. Using cache to share objects between threads
  4. Assuming items will be in cache immediately after storing them
  5. Storing an entire collection with nested objects
  6. Storing parent-child objects together and also separately
  7. Caching configuration settings
  8. Caching live objects that have an open handle to a stream, file, registry, or network
  9. Storing the same item using multiple keys
  10. Not updating or deleting items in cache after updating or deleting them on persistent storage
Let’s see what they are and how to avoid them.
I am assuming you have been using ASP.NET Cache or Enterprise Library Cache for a while and were satisfied with it, but now you need more scalability and have therefore moved to an out-of-process or distributed cache like Velocity or memcached. Since then, things have started to fall apart, and so the common mistakes listed below apply to you.


Relying on .NET’s Default Serializer

When you use an out-of-process caching solution like Velocity or memcached, items in the cache are stored in a separate process from the one your application runs in. Every time you add an item to the cache, the client library serializes the item into a byte array and sends the byte array to the cache server to store it. Similarly, when you get an item from the cache, the cache server sends the byte array back to your application, and the client library deserializes it into the target object. .NET’s default serializer is not optimal because it relies on reflection, which is CPU intensive. As a result, storing items in the cache and getting them back adds serialization and deserialization overhead that results in high CPU usage, especially if you are caching complex types. This high CPU usage happens in your application, not on the cache server. So, you should always use one of the better approaches shown in this article so that the CPU consumption in serialization and deserialization is minimized. I personally prefer the approach where you serialize and deserialize the properties yourself by implementing the ISerializable interface and its deserialization constructor.
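To make the ISerializable approach concrete, here is a minimal sketch. The Customer class, its properties, and the short key names are all hypothetical, and BinaryFormatter only stands in for whatever serializer your cache client actually uses. It targets the .NET Framework of the time; BinaryFormatter is obsolete in newer .NET versions.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical cached type; the name and properties are illustrative only.
[Serializable]
public sealed class Customer : ISerializable
{
    public int Id { get; set; }
    public string Name { get; set; }
    public DateTime LastLogin { get; set; }

    public Customer() { }

    // Deserialization constructor: the formatter calls this and we read
    // each value back ourselves, so it does not have to discover and set
    // every field on its own.
    private Customer(SerializationInfo info, StreamingContext context)
    {
        Id = info.GetInt32("i");
        Name = info.GetString("n");
        LastLogin = info.GetDateTime("l");
    }

    // Called by the formatter during serialization; each property is
    // written explicitly under a short key.
    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue("i", Id);
        info.AddValue("n", Name);
        info.AddValue("l", LastLogin);
    }
}

public static class Demo
{
    // Round-trips an object through a byte stream, which is roughly what an
    // out-of-process cache client does on every Put and Get.
    public static T RoundTrip<T>(T item)
    {
        var formatter = new BinaryFormatter();
        using (var stream = new MemoryStream())
        {
            formatter.Serialize(stream, item);
            stream.Position = 0;
            return (T)formatter.Deserialize(stream);
        }
    }

    public static void Main()
    {
        var original = new Customer
        {
            Id = 42,
            Name = "Alice",
            LastLogin = new DateTime(2011, 6, 16)
        };
        var copy = RoundTrip(original);
        Console.WriteLine(copy.Id + " " + copy.Name);
    }
}
```

The keys passed to AddValue must match the ones read in the constructor. Whether this actually lowers CPU usage for your types is worth measuring with your own cache client before you commit to it.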

Read more: CodeProject