DIY Tagged Cache

by Anton Musatov, December 9th, 2024



Developers often joke that programming has two main challenges:

  • naming variables
  • cache invalidation


This joke is not far from the truth: managing caches, especially their invalidation, can indeed become a serious task. In this article, I’ll explain how to easily implement tagged cache functionality based on an existing caching service.


Imagine we have a system where users add articles. For each user, we display statistics about their articles in their personal dashboard: the number of articles, average word count, publication frequency, etc. To speed up the system, we cache this data. A unique cache key is created for each report.


The question arises: how to invalidate such caches when data changes? One approach is to manually clear the cache for each event, for instance, when a new article is added:

class InvalidateArticleReportCacheOnArticleCreated {
    public function handle(event: ArticleCreatedEvent): void {
        // Every report key for the user has to be listed by hand.
        this->cacheService->deleteMultiple([
            'user_article_report_count_' . event->userId,
            'user_article_report_word_avg_' . event->userId,
            'user_article_report_freq_avg_' . event->userId,
        ])
    }
}


This method works but becomes cumbersome when dealing with a large number of reports and keys. This is where tagged caching comes in handy. Tagged caching allows data to be associated not only with a key but also with an array of tags. Subsequently, all records associated with a specific tag can be invalidated, significantly simplifying the process.


Writing a value to the cache with tags:

this->taggedCacheService->set(
    key: 'user_article_report_count_' . user->id,
    value: value,
    tagNames: [
        'user_article_cache_tag_' . user->id,
        'user_article_report_cache_tag_' . user->id,
        'user_article_report'
    ]
)


Invalidating the cache by tags:

class UpdateCacheTagsOnArticleCreated {
    public function handle(event: ArticleCreatedEvent): void {
        // Bumping the tag version invalidates every value written under that tag.
        this->taggedCacheService->updateTagsVersions([
            'user_article_cache_tag_' . event->userId,
        ])
    }
}


Here, the tag 'user_article_cache_tag_' . user->id represents changes to the user's articles and can be used to invalidate any cache that depends on that data. The more specific tag 'user_article_report_cache_tag_' . user->id clears only that user's reports, while the general tag 'user_article_report' invalidates report caches for all users.


If your caching library does not support tagging, you can implement it yourself. The main idea is to store a current version for every tag and, alongside each tagged value, the tag versions that were current when the value was written. When a value is read, the current tag versions are fetched as well and compared with the stored ones; if any of them differs, the value is treated as stale.
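The idea fits in a few lines. The following is a minimal illustration in Python rather than the implementation built in the rest of the article: plain dictionaries stand in for the caching service, an integer counter stands in for the version generator, and the names `write`, `read`, `bump` and `current_version` are hypothetical.

```python
import itertools

# Hypothetical in-memory stand-ins for the caching service.
tag_versions = {}   # tag name -> current version
entries = {}        # cache key -> (value, {tag name: version at write time})
_next_version = itertools.count(1)


def current_version(tag):
    """Return the tag's current version, creating one on first use."""
    if tag not in tag_versions:
        tag_versions[tag] = next(_next_version)
    return tag_versions[tag]


def bump(tag):
    """Invalidate everything written under `tag` by giving it a new version."""
    tag_versions[tag] = next(_next_version)


def write(key, value, tags):
    """Store the value together with a snapshot of its tags' versions."""
    entries[key] = (value, {tag: current_version(tag) for tag in tags})


def read(key):
    """Return the value only if every snapshotted version is still current."""
    if key not in entries:
        return None
    value, snapshot = entries[key]
    if any(current_version(tag) != version for tag, version in snapshot.items()):
        del entries[key]  # stale: at least one tag has moved on
        return None
    return value


write("report_count_1", 12, ["user_1", "reports"])
write("report_count_2", 7, ["user_2", "reports"])
bump("user_1")                  # only user 1's data changed
print(read("report_count_1"))   # None
print(read("report_count_2"))   # 7
```

Note that nothing is deleted at invalidation time: `bump` touches a single tag entry, and stale values are discovered lazily on the next read.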


Creating a TaggedCache class:

class TaggedCache {
    private cacheService: CacheService
}


Implementing the set method for writing to the cache with tags. The method writes the value to the cache, retrieves the current versions of the given tags, and stores those versions under an additional key, which is derived from the original key by adding a prefix.

class TaggedCache {
    private cacheService: CacheService

    public function set(
        key: string,
        value: mixed,
        tagNames: string[],
        ttl: int
    ): bool {
        if (empty(tagNames)) {
            return false
        }

        tagVersions = this->getTagsVersions(tagNames)

        tagsCacheKey = this->getTagsCacheKey(key)

        // The value and its tag-version snapshot are written together with the same TTL.
        return this->cacheService->setMultiple(
            [
                key => value,
                tagsCacheKey => tagVersions,
            ],
            ttl
        )
    }

    private function getTagsVersions(tagNames: string[]): array<string, string> {
        tagVersions = []

        tagVersionKeys = []

        foreach (tagNames as tagName) {
            tagVersionKeys[tagName] = this->getTagVersionKey(tagName)
        }

        if (empty(tagVersionKeys)) {
            return tagVersions
        }

        tagVersionsCache = this->cacheService->getMultiple(tagVersionKeys)

        foreach (tagVersionKeys as tagName => tagVersionKey) {
            // A tag with no stored version yet gets a fresh one.
            if (empty(tagVersionsCache[tagVersionKey])) {
                tagVersionsCache[tagVersionKey] = this->updateTagVersion(tagName)
            }

            tagVersions[tagName] = tagVersionsCache[tagVersionKey]
        }

        return tagVersions
    }

    private function getTagVersionKey(tagName: string): string {
        return 'tag_version_' . tagName
    }

    private function getTagsCacheKey(key: string): string {
        return 'cache_tags_tagskeys_' . key
    }
}


Adding the get method to retrieve tagged values from the cache. We fetch the value by its key together with the tag versions stored for that key, then compare those versions with the current ones. If any tag version has changed, the value is deleted from the cache and null is returned; otherwise the cached value is returned.

class TaggedCache {
    private cacheService: CacheService

    public function get(key: string): mixed {
        tagsCacheKey = this->getTagsCacheKey(key)

        values = this->cacheService->getMultiple([key, tagsCacheKey])

        // Compare the value against null so that falsy values such as 0 still count as hits.
        if (values[key] === null || empty(values[tagsCacheKey])) {
            return null
        }

        value = values[key]

        tagVersions = values[tagsCacheKey]

        if (! this->isTagVersionsValid(tagVersions)) {
            this->cacheService->deleteMultiple([key, tagsCacheKey])

            return null
        }

        return value
    }

    private function isTagVersionsValid(tagVersions: array<string, string>): bool {
        tagNames = array_keys(tagVersions)

        actualTagVersions = this->getTagsVersions(tagNames)

        foreach (tagVersions as tagName => tagVersion) {
            if (empty(actualTagVersions[tagName])) {
                return false
            }

            if (actualTagVersions[tagName] !== tagVersion) {
                return false
            }
        }

        return true
    }
}
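A detail worth checking in the read path is how a miss is detected. A truthiness-style test (PHP's `empty()` treats `0`, `''` and `[]` as empty) would report a legitimately cached zero, such as the article count of a new user, as a miss on every read. The sketch below, in Python with a hypothetical dict-backed `lookup` helper, separates "absent" from "falsy" with a sentinel.

```python
# Hypothetical sentinel marking "key not present", distinct from any real value.
MISSING = object()

backend = {"user_article_report_count_42": 0}  # a new user with zero articles


def lookup_truthy(key):
    """Buggy variant: any falsy value is mistaken for a miss."""
    value = backend.get(key)
    return value if value else None


def lookup(key, default=None):
    """Safer variant: only a truly absent key counts as a miss."""
    value = backend.get(key, MISSING)
    return default if value is MISSING else value


print(lookup_truthy("user_article_report_count_42"))  # None (the cached 0 is lost)
print(lookup("user_article_report_count_42"))         # 0
print(lookup("no_such_key"))                          # None
```

If the underlying cache cannot distinguish a stored null from a missing key, wrapping values in a small envelope before writing them is another option.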


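Implementing the updateTagsVersions method to update tag versions. Here, we iterate over all the tags provided and assign each one a new version. Any value that differs from the previous version will do; the example uses a high-resolution timer reading (PHP's hrtime() counts from an arbitrary point, so it is not the wall-clock time).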

class TaggedCache {
    private cacheService: CacheService

    public function updateTagsVersions(tagNames: string[]): void {
        foreach (tagNames as tagName) {
            this->updateTagVersion(tagName)
        }
    }

    private function updateTagVersion(tagName: string): string {
        tagKey = this->getTagVersionKey(tagName)

        tagVersion = this->generateTagVersion()

        return this->cacheService->set(tagKey, tagVersion) ? tagVersion : ''
    }

    private function generateTagVersion(): string {
        return (string) hrtime(true)
    }
}
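To see the three methods working together, here is a compact Python port that can be run as is. It is a sketch under assumptions, not a drop-in implementation: `DictCache` is a hypothetical in-memory backend that ignores TTLs, method names are adapted to Python conventions, and `uuid4` replaces `hrtime(true)` as the version generator.

```python
import uuid


class DictCache:
    """Hypothetical in-memory backend exposing the calls the article relies on."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value
        return True

    def set_multiple(self, items, ttl=None):
        # The TTL is accepted but ignored in this sketch.
        self._data.update(items)
        return True

    def get_multiple(self, keys):
        return {key: self._data.get(key) for key in keys}

    def delete_multiple(self, keys):
        for key in keys:
            self._data.pop(key, None)
        return True


class TaggedCache:
    def __init__(self, cache):
        self._cache = cache

    def set(self, key, value, tag_names, ttl=None):
        if not tag_names:
            return False
        versions = self._tag_versions(tag_names)
        return self._cache.set_multiple(
            {key: value, self._tags_key(key): versions}, ttl
        )

    def get(self, key):
        tags_key = self._tags_key(key)
        found = self._cache.get_multiple([key, tags_key])
        stored = found[tags_key]
        if found[key] is None or not stored:
            return None
        # Any difference between stored and current versions means "stale".
        if self._tag_versions(list(stored)) != stored:
            self._cache.delete_multiple([key, tags_key])
            return None
        return found[key]

    def update_tags_versions(self, tag_names):
        for name in tag_names:
            self._update_tag_version(name)

    def _tag_versions(self, tag_names):
        keys = {name: "tag_version_" + name for name in tag_names}
        cached = self._cache.get_multiple(list(keys.values()))
        # A tag seen for the first time gets a fresh version.
        return {
            name: cached[key] or self._update_tag_version(name)
            for name, key in keys.items()
        }

    def _update_tag_version(self, tag_name):
        # Assumption: uuid4 stands in for the article's hrtime(true).
        version = uuid.uuid4().hex
        self._cache.set("tag_version_" + tag_name, version)
        return version

    @staticmethod
    def _tags_key(key):
        return "cache_tags_tagskeys_" + key


cache = TaggedCache(DictCache())
for user_id, count in ((1, 12), (2, 7)):
    cache.set(
        f"user_article_report_count_{user_id}",
        count,
        [f"user_article_cache_tag_{user_id}", "user_article_report"],
    )

cache.update_tags_versions(["user_article_cache_tag_1"])
print(cache.get("user_article_report_count_1"))  # None: user 1 was invalidated
print(cache.get("user_article_report_count_2"))  # 7: user 2 is untouched

cache.update_tags_versions(["user_article_report"])
print(cache.get("user_article_report_count_2"))  # None: the shared tag moved
```

The demo mirrors the tag layout used earlier: a per-user tag invalidates one user's entries, while the shared `user_article_report` tag invalidates the reports of every user.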


This approach is both convenient and universal. Tagged caching eliminates the need to list every key manually when invalidating. However, it requires additional resources: the cache has to store the tag versions plus a version snapshot for every tagged value, and every read has to fetch the current tag versions and compare them with that snapshot.


If your caching service is fast and not heavily constrained in size, this approach will not significantly affect performance. To minimize the load, you can combine tagged caching with local caching mechanisms.
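One possible reading of "local caching" is to memoize tag versions for the lifetime of a single request, so that repeated reads do not fetch the same version keys again. The Python sketch below illustrates that under assumptions: `RequestLocalTagVersions` and the `fetch_versions` callable are hypothetical names, and a plain dictionary stands in for the shared cache.

```python
class RequestLocalTagVersions:
    """Memoizes tag versions for one request to save backend round trips."""

    def __init__(self, fetch_versions):
        # fetch_versions: hypothetical callable, list of tag names -> {name: version}
        self._fetch = fetch_versions
        self._memo = {}

    def get(self, tag_names):
        missing = [name for name in tag_names if name not in self._memo]
        if missing:
            self._memo.update(self._fetch(missing))
        return {name: self._memo[name] for name in tag_names}

    def forget(self, tag_names):
        """Call after bumping tags locally so the next read refetches them."""
        for name in tag_names:
            self._memo.pop(name, None)


# Stand-in for the shared cache, recording how often it is queried.
shared = {"user_article_cache_tag_1": "v1", "user_article_report": "v7"}
calls = []


def fetch_from_shared(names):
    calls.append(list(names))
    return {name: shared[name] for name in names}


local = RequestLocalTagVersions(fetch_from_shared)
local.get(["user_article_cache_tag_1", "user_article_report"])
local.get(["user_article_cache_tag_1"])      # served from the memo
print(len(calls))                            # 1

shared["user_article_cache_tag_1"] = "v2"    # this request bumps the tag...
local.forget(["user_article_cache_tag_1"])   # ...and drops its memoized copy
print(local.get(["user_article_cache_tag_1"]))  # {'user_article_cache_tag_1': 'v2'}
print(len(calls))                            # 2
```

The trade-off is freshness: a tag bumped by another process during the request goes unnoticed until the memo is dropped, so this fits short-lived requests better than long-running workers.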


In this way, tagged caching not only simplifies invalidation but also makes working with data more flexible and understandable, especially in complex systems with large amounts of interconnected data.