Implementing a shared cache: Part 4

It has been almost two weeks since I posted Part 3, so it’s probably time to wrap up my series on implementing KSharedDataCache with this post. As a recap for those who don’t want to skip to the end of Part 3, I said I’d talk about defragmenting the shared cache and porting KIconLoader.

Defragmentation

One of the sub-optimal parts of my implementation of KSharedDataCache is the fact that the data must be contiguous in memory. It would actually be fairly simple to change this due to the existing page design, but right now this is what we’ve got.

This is sub-optimal because of an effect called fragmentation (more precisely, external fragmentation). The effect is illustrated in the following diagram:

The problem with fragmentation can be seen in the cache layout at the bottom. The maroon blocks indicate allocated pages. In the bottom cache layout, even though the overall allocation is less than in the other cache, it is not possible to fit the new block in memory because there is no solid block of free memory of sufficient size.
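
To make the problem concrete, here is a minimal sketch of the search an allocator has to do. It is not the actual KSharedDataCache code: the one-bool-per-page model and the name findFreeRun are assumptions made purely for illustration.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical model: one flag per page, true = allocated, false = free.
// Returns the index of the first run of `needed` consecutive free pages,
// or std::nullopt if no run of that length exists.
std::optional<std::size_t> findFreeRun(const std::vector<bool> &used,
                                       std::size_t needed)
{
    if (needed == 0) {
        return std::size_t{0};  // an empty request trivially fits anywhere
    }

    std::size_t runStart = 0;
    std::size_t runLength = 0;
    for (std::size_t i = 0; i < used.size(); ++i) {
        if (used[i]) {
            runLength = 0;      // an allocated page breaks the current run
            continue;
        }
        if (runLength == 0) {
            runStart = i;       // a new run of free pages starts here
        }
        if (++runLength == needed) {
            return runStart;
        }
    }
    return std::nullopt;        // enough free pages may exist, but not in a row
}
```

With eight pages laid out as used, free, used, free, free, used, free, used, a request for three pages fails even though four pages are free, because the longest free run is only two pages long. A layout with five used pages followed by three free ones satisfies the same request despite having less free space overall.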

This problem could be solved if only there were a way to move the allocated memory around. In the case of KSharedDataCache this can be done, because only the cache records the location of each data block (when an item is returned to an application, it is actually a copy). This process is called defragmentation, and is essentially the same idea as what disk-based defragmenters do.

My defragmentation is probably fairly naïve but does the job, simply looking for used pages of memory and moving them as far to the beginning as possible. The more interesting part is deciding when to defragment. Right now defragmentation is performed as follows:

Before removing currently-used pages because of insufficient consecutive memory, provided that total free space is higher than a certain threshold.
After evicting cache entries, since there are probably more holes in memory, and the reason we evicted things from the cache in the first place was insufficient consecutive free memory.
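
The compaction pass itself (finding used pages and sliding them toward the beginning) could be sketched roughly as below. This is an illustration under assumptions, not the real routine: the Cache and Entry structs, the kPageSize and kFree constants, and the per-page owner table are all hypothetical stand-ins for the page table and index table.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

constexpr std::size_t kPageSize = 4;  // hypothetical, tiny for illustration
constexpr int kFree = -1;             // hypothetical marker for a free page

struct Entry {
    std::size_t firstPage;  // where this entry's contiguous data starts
    std::size_t pageCount;  // how many consecutive pages it occupies
};

struct Cache {
    std::vector<int> pageOwner;  // per page: index into entries, or kFree
    std::vector<Entry> entries;
    std::vector<char> data;      // pageOwner.size() * kPageSize bytes
};

// Slide every entry as far toward page 0 as possible, keeping entry order.
// Only the cache's own bookkeeping refers to page positions, so nothing
// outside the cache needs to be told that the data moved.
void defragment(Cache &cache)
{
    std::size_t writePage = 0;  // next page that should hold live data
    std::size_t readPage = 0;   // page currently being examined

    while (readPage < cache.pageOwner.size()) {
        const int owner = cache.pageOwner[readPage];
        if (owner == kFree) {
            ++readPage;         // skip over a hole
            continue;
        }

        Entry &entry = cache.entries[owner];
        const std::size_t count = entry.pageCount;
        if (writePage != readPage) {
            // Source and destination may overlap, hence memmove.
            std::memmove(&cache.data[writePage * kPageSize],
                         &cache.data[readPage * kPageSize],
                         count * kPageSize);
            // Free the old pages first, then claim the new ones, so pages
            // shared by both ranges end up marked as owned.
            for (std::size_t i = 0; i < count; ++i) {
                cache.pageOwner[readPage + i] = kFree;
            }
            for (std::size_t i = 0; i < count; ++i) {
                cache.pageOwner[writePage + i] = owner;
            }
            entry.firstPage = writePage;
        }
        writePage += count;
        readPage += count;
    }
}
```

After one pass all free pages sit in a single run at the end, so any request that fits in the total free space also fits contiguously. The cost is a full scan plus a copy of every entry that sits behind a hole, which is why when to run it matters more than how.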

As I said, my defragmentation routine is very simple and could probably be improved easily. I haven’t noticed it causing problems during desktop usage, but that may be because it rarely runs (if at all), thanks to the cache aging I mentioned in Part 3.

Porting KIconLoader

Perhaps the largest impetus for doing all this work in the first place was KIconLoader, which used to use KPixmapCache to cache loaded icons. KIconLoader is used everywhere in KDE, so many KPixmapCache-related crashes were first noticed in seemingly unrelated applications that were simply trying to load icons.

Porting KIconLoader to KSharedDataCache was not a direct method name replacement unfortunately, as one of the things KIconLoader stored was the path that the icon was found at. (This used to be done by using the custom data API for KPixmapCache I mentioned in Part 1). I had first intended to use KImageCache (an image-related subclass of KSharedDataCache) for KIconLoader, but I ended up using KSharedDataCache directly, to hold cache entries that simply contained the pixmap and the path.
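
As a rough illustration of what a cache entry holding both the pixmap and the path might look like, here is a minimal sketch that packs the two into a single byte blob and unpacks them again. Everything in it is an assumption for illustration: the IconEntry struct, the pack and unpack helpers, the length-prefixed layout, and the use of plain bytes as a stand-in for the pixmap. It does not show how KIconLoader actually serializes its entries.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical entry: the path the icon was found at, plus image bytes
// standing in for the pixmap.
struct IconEntry {
    std::string path;
    std::vector<std::uint8_t> image;
};

// Layout (an assumption): 4-byte little-endian path length, path, image.
std::vector<std::uint8_t> pack(const IconEntry &entry)
{
    const auto pathLen = static_cast<std::uint32_t>(entry.path.size());
    std::vector<std::uint8_t> blob;
    blob.reserve(4 + entry.path.size() + entry.image.size());
    for (int shift = 0; shift < 32; shift += 8) {
        blob.push_back(static_cast<std::uint8_t>(pathLen >> shift));
    }
    blob.insert(blob.end(), entry.path.begin(), entry.path.end());
    blob.insert(blob.end(), entry.image.begin(), entry.image.end());
    return blob;
}

// Returns std::nullopt if the blob is too short to be a valid entry.
std::optional<IconEntry> unpack(const std::vector<std::uint8_t> &blob)
{
    if (blob.size() < 4) {
        return std::nullopt;
    }
    std::uint32_t pathLen = 0;
    for (int i = 0; i < 4; ++i) {
        pathLen |= static_cast<std::uint32_t>(blob[i]) << (8 * i);
    }
    if (pathLen > blob.size() - 4) {
        return std::nullopt;  // claimed path length exceeds the blob
    }
    IconEntry entry;
    entry.path.assign(blob.begin() + 4, blob.begin() + 4 + pathLen);
    entry.image.assign(blob.begin() + 4 + pathLen, blob.end());
    return entry;
}
```

Keeping the path and the image in one blob means a single cache lookup returns both, which is what replaces the custom data API that KPixmapCache used to provide.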

One problem that came up, and that I only fixed a few minutes ago, was that KIconLoader used to cache not only loaded icons but also failed icon lookups, which was a behavior I had not ported over. This was apparently especially noticeable when browsing large directories in Dolphin. Either way, that is now fixed.
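
The idea of caching failed lookups can be sketched as follows. This is a simplified model, not the KIconLoader code: the IconPathCache class, its Search callback, and the in-process std::unordered_map standing in for the shared cache are all hypothetical.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical wrapper that remembers both successful and failed lookups.
class IconPathCache {
public:
    // The expensive search (e.g. walking icon theme directories).
    using Search =
        std::function<std::optional<std::string>(const std::string &)>;

    explicit IconPathCache(Search search) : m_search(std::move(search)) {}

    std::optional<std::string> find(const std::string &name)
    {
        const auto it = m_entries.find(name);
        if (it != m_entries.end()) {
            return it->second;  // cache hit, possibly a remembered failure
        }
        std::optional<std::string> result = m_search(name);
        m_entries.emplace(name, result);  // stored even when the search failed
        return result;
    }

private:
    Search m_search;
    std::unordered_map<std::string, std::optional<std::string>> m_entries;
};
```

Without the negative entry, every request for an icon that does not exist repeats the full search, which adds up quickly when a directory view asks for the same missing icon once per file.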

Future Directions

I’m proud of the work I put into KSharedDataCache, especially since it has been running in trunk for about 4 weeks now with no major issues popping up. However, there are quite a few things that could be improved about it:

Cache corruption: Although in my opinion the risk of crashes from a corrupted cache is less right now due to the cache layout and non-usage of QDataStream, the possibility is not zero. Because of the serious consequences of cache corruption leading to crashes, it would be nice to have an efficient way to mark that a disk cache should be deleted because it is corrupt. I’ve thought of some things but have no concrete plans at this point.
The page table/index table method of storing data is very simplistic. There is surely a more appropriate method buried in some ACM or IEEE publication somewhere, even within the limits of a fixed memory size. As it stands my method blends some of the disadvantages of 1960s-era memory allocators with those of paged memory allocators, without all of the benefits.
Assuming defragmentation remains required, the defragmenter could probably be made faster as well.
It is not at this point possible to resize a cache once it has been created. There’s no reason in theory that it can’t be done, it’s just not implemented. (Note that implementing this is more complicated than simply changing a size flag in the cache…)
The cache could possibly be made more concurrent with lock-free algorithms or finer-grained locking. This is not something I’d like to touch until I have a way to verify correctness of the result, however.
Finally, it is possible that someone has done this way better and that I simply missed it, in which case we should consider adopting that library as a dependency and making KSharedDataCache a wrapper around it.
Should we remove old KPixmapCache caches after starting up a shiny new 4.5 desktop?

So, this concludes my series on implementing a shared cache. I’ve got to get working on other efficiency improvements, new kdesvn-build releases, classes, etc. It’s been fun writing though!
