Cache all CakePHP find queries

I have been adding some caching to Infinitas recently and found it rather difficult to cache all find queries without overloading and rewriting the entire Model::find() method. While it's not the most complicated method in the world, rewriting it is not DRY, and at some point in the future it's likely to bite me in the arse when something changes.

A basic way to cache the queries is as follows:

  • Create a hash of the query array
  • Check the cache using the hash as a key
    • If the cache exists return it
  • If the cache did not exist fetch the results
  • Save the results using the hash as a key
  • Return the results
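
As a rough sketch, the steps above look like this in plain PHP. The cachedFetch() function, the $cache array (a stand-in for a real cache store) and the $fetch callback are all made up for illustration and are not part of CakePHP:

```php
<?php
// Hypothetical helper: $cache is a plain array standing in for a real cache store,
// $fetch is any callable that runs the query and returns the results.
function cachedFetch(array $query, array &$cache, callable $fetch) {
	// Create a hash of the query array and use it as the cache key
	$key = md5(json_encode($query));

	// If the cache exists, return it
	if (array_key_exists($key, $cache)) {
		return $cache[$key];
	}

	// Otherwise fetch the results, save them under the hash and return them
	$results = $fetch($query);
	$cache[$key] = $results;
	return $results;
}
```

Calling cachedFetch() twice with the same query array should only invoke $fetch once; the second call is served from $cache.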

The problem comes when using custom find methods, as the call is done in two stages. First, CakePHP calls the method with $state set to 'before', which is where query params can be processed and manipulated. Cake then runs the actual query and calls the custom find a second time with $state set to 'after', which is where the actual results are processed.

The way find was set up, it was not easy to cache based on the query params, as they were not easily accessible after the custom find method had manipulated them. So, using the steps above, the hash of the query may not be correct, because the custom find method's 'before' call may have modified the query.
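
To make the two stages a bit more concrete, here is a small standalone simulation. FakeModel, its rows and its simplified find() are invented for this sketch and are not CakePHP's implementation; only the _findXxx($state, $query, $results) shape follows the CakePHP 2.x custom find convention:

```php
<?php
// Standalone simulation, not CakePHP code: FakeModel and its rows are hypothetical.
class FakeModel {
	protected $_rows = array(
		array('id' => 1, 'published' => true),
		array('id' => 2, 'published' => false),
	);

	// Custom find in the CakePHP 2.x style: called once for each state
	protected function _findPublished($state, $query, $results = array()) {
		if ($state === 'before') {
			// Stage one: manipulate the query before it is run
			$query['conditions']['published'] = true;
			return $query;
		}
		// Stage two: process the rows that were fetched
		return array_column($results, 'id');
	}

	// Simplified stand-in for Model::find()
	public function find($type, $query = array()) {
		$method = '_find' . ucfirst($type);
		$query = $this->{$method}('before', $query);

		// Stand-in for the data source read, using the modified query
		$results = array_values(array_filter($this->_rows, function ($row) use ($query) {
			return $row['published'] === $query['conditions']['published'];
		}));

		return $this->{$method}('after', $query, $results);
	}
}
```

The caller passes an empty query to find('published'), but the query that is actually run contains the published condition. A hash taken from the caller's array would therefore not describe the query that produced the results.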

I submitted a patch (and it's merged!) which abstracts find a bit and makes it much easier to cache find queries. Here is an example of how to cache all find queries (this is probably not a good idea, as you could end up with a cached version of every page of pagination, sorted by every column, with every possible limit). All that needs to be done now is to overload the _readDataSource() method, where CakePHP fetches the data, and check for cache.

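protected function _readDataSource($type, $query) {
	// Build the cache key from the final query array
	$cacheName = md5(json_encode($query));

	// Cache::read() returns false when nothing is cached under this key
	$cache = Cache::read($cacheName, 'cache-config-name');
	if ($cache !== false) {
		return $cache;
	}

	// Cache miss: let CakePHP fetch the data, then store it for next time
	$results = parent::_readDataSource($type, $query);
	Cache::write($cacheName, $results, 'cache-config-name');
	return $results;
}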

So that is pretty simple: only one method, the one where the data is fetched from the data source, has to be overloaded. Everything in moderation though, so it's probably best to limit what is cached. Filtering out a few cases, such as the ones listed below, means you won't need 4 million gigs of RAM for the cache, and only the less frequent queries go uncached.

Some things you might want to skip:

  • Results that are paginated (queries may have the page param here)
  • Results sorted differently from the default
  • Results with a custom limit (eg: limit != 20)
  • Results with a high limit (eg: limit >= 100)
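
A minimal sketch of such a filter could look like the following. shouldCacheQuery() is a made-up helper, the 20 and 100 values are just the examples from the list, and treating "paginated" as "page greater than 1" and "sorted differently" as "any non-empty order" are my assumptions:

```php
<?php
// Hypothetical helper: decides whether a find query is worth caching.
// $defaultLimit and $maxLimit mirror the example values from the list above.
function shouldCacheQuery(array $query, $defaultLimit = 20, $maxLimit = 100) {
	// Skip paginated results (assumption: anything past the first page)
	$page = isset($query['page']) ? (int)$query['page'] : 1;
	if ($page > 1) {
		return false;
	}

	// Skip results with an explicit sort order (assumption: no default order set)
	$order = isset($query['order']) ? array_filter((array)$query['order']) : array();
	if (!empty($order)) {
		return false;
	}

	// Skip custom limits and high limits
	$limit = isset($query['limit']) ? (int)$query['limit'] : null;
	if ($limit !== null && ($limit !== $defaultLimit || $limit >= $maxLimit)) {
		return false;
	}

	return true;
}
```

With the default values the high limit check is already covered by the custom limit check; they are kept separate so either one can be relaxed on its own.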

It's also easy to cache only queries that are find('list'), for example, using $this->findQueryType as a check. Another good idea is to check for $query['cache'], so that in your code you can do something like the following (the overloaded method would need to be modified to take these options into account; the example above does not):

Don't cache this query:
$this->find('all', array(
    'cache' => false
));
Custom cache config:
$this->find('all', array(
    'cache' => array(
        'name' => 'other_cache',
        'duration' => '+2 weeks'
    )
));
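
One way the overloaded method could interpret these options is sketched below. resolveCacheOptions() is a hypothetical helper, not something CakePHP provides, and the 'cache-config-name' default is just the placeholder used in the earlier example:

```php
<?php
// Hypothetical helper: normalises the 'cache' key of a find query.
// Returns false when caching is disabled, otherwise an array with 'name' and 'duration'.
function resolveCacheOptions(array $query, $defaultName = 'cache-config-name') {
	$defaults = array('name' => $defaultName, 'duration' => null);

	// No 'cache' key (or plain true): use the default config
	if (!array_key_exists('cache', $query) || $query['cache'] === true) {
		return $defaults;
	}

	// 'cache' => false: the caller does not want this query cached
	if ($query['cache'] === false) {
		return false;
	}

	// 'cache' => 'other_cache': treat a bare string as the config name
	if (is_string($query['cache'])) {
		return array_merge($defaults, array('name' => $query['cache']));
	}

	// 'cache' => array(...): custom options override the defaults
	return array_merge($defaults, (array)$query['cache']);
}
```

Inside _readDataSource() the result would decide whether to go straight to parent::_readDataSource() (when it is false) or which config name to pass to Cache::read() and Cache::write(). How a custom duration is applied depends on how the cache configs are set up, so that part is left out here.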

real34 on 17/2/13

This is nice!

I have 2 questions about this:
* How do you manage cache invalidation effectively in this case? Since the cache key is not prefixed with the model, I do not see how to do it by adding code in the `Model::_clearCache()` method, for instance.
* Have you considered an alternative solution at the datasource level? What were the limitations? That is what the `Model::cacheQueries` attribute is for. To make this cache persistent across requests, it seems that just minor hacks could be made to DboSource::_writeQueryCache / getQueryCache (or using PHP magic methods to overload access to `_queryCache`).