Sitecore Content Hub does not provide built-in functionality to track asset usage in the Sitecore CMS. When using the Content Hub CDN to serve images directly to a Sitecore XP website, this creates the risk of deleting assets that are still in use, resulting in broken images. And as a result, content editors become hesitant to delete images and clean up their media library.
There are some examples of add-ons that implement this functionality, but most of them offer insight into asset usage in the form of an External Page Component that retrieves the usages upon request: click a button and see a list of all content items that reference the current asset. Although this is already helpful, the architecture of such solutions does not allow automation based on this data and requires manual actions from users to retrieve the information.
In this article I describe a concept for tracking asset usage in real time by pushing usage updates from Sitecore to Content Hub through a resilient, cloud-native architecture.
The ideal solution
What I want is real-time insight rather than looking up the usage on request. Why? Not only because it’s more convenient from a UX perspective, but mostly because I want to use this data within workflows too (like Content Hub Actions): if an asset is in use, I want to block unintended deletion and give the content editor the choice to delete the asset anyway or cancel the delete operation, just like we’re used to within the Sitecore CMS. This preserves content integrity and helps a great deal with proper content management. As a bonus, Content Hub editors feel more comfortable keeping their asset library up to date and can confidently remove old data.
The challenge
If I want to use the usage data (a list of all content items that reference a certain asset) in any workflow or automation, it needs to be stored in Content Hub as part of the native asset data within the M.Asset schema, and it needs to be up to date for every asset at any given time. This means that if anything changes in the CMS (for example when a new asset is referenced or a reference is removed), the CMS needs to notify Content Hub that the usage of that asset has changed. So instead of Content Hub retrieving the usage on request by consulting the Sitecore CMS indexes, we need the CMS to proactively update the asset data in Content Hub. We need to move from a pull concept to a push concept.
There are, however, two issues with this approach. The first is that I don’t want to create a direct dependency between Content Hub and the Sitecore CMS: if the Content Hub API is unresponsive or down, or if I run bulk updates of any kind, an update call could get lost and my usage overview would become corrupt. I don’t want performance dependencies between those two systems. The second issue is that when I respond to an item save event in the Sitecore CMS, I can only see the new state of the item. If a user isn’t using item versions, I don’t know which former asset references have been removed: I can add a reference to the new asset in Content Hub, but I can’t remove the reference from the old asset that was used before the save action.
I am aware that Sitecore raises both an item:saved and an item:saving event, of which the latter contains the values before and after the save. In my case, however, I am only interested in publishing events: merely saving a new version doesn’t change the asset usage on the live website, only publishing does. And publishing is a bit more complex. Sitecore item versions aren’t used very often, sometimes an item is new, and sometimes an item is unpublished, which removes the reference even though the content item didn’t change at all in the master database of Sitecore. So we need to look at the bigger picture and preferably build a solution that is agnostic to the specific architecture of Sitecore XP. That also opens the door to newer concepts like Edge publishing in SitecoreAI, or maybe even the yet-to-be-released content concept for SitecoreAI that throws the whole publishing concept overboard entirely.
The architecture behind the idea
To solve both these issues, I came up with a cloud-native solution that introduces a number of middleware components to bridge the gap, connecting Content Hub to Sitecore CMS and vice versa.
The first component is a data store that keeps track of which CMS content items reference which Content Hub assets, like a join table in SQL. This allows me to compare the current state with the new state and determine the delta when an item is published in the Sitecore CMS. Regardless of the Sitecore CMS setup and architecture, I can now simply see whether a published item uses an asset and look up whether that was the case before, allowing me to add or remove a reference accordingly.
The second component is a microservice that handles the logic I just described, determining the delta and deciding which Content Hub asset it needs to update.
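To make the delta idea a bit more tangible, here is a minimal sketch in Python of how the data store and the microservice logic could work together. It is an illustration under assumptions, not the actual implementation: the class name `UsageStore`, the method `apply_publish` and the in-memory dictionary are all hypothetical stand-ins for a real data store.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class UsageStore:
    """Hypothetical in-memory stand-in for the data store.

    Maps a CMS item id to the set of Content Hub asset ids it referenced
    the last time it was published.
    """
    refs: Dict[str, Set[str]] = field(default_factory=dict)

    def apply_publish(self, item_id: str, published_assets: Set[str]) -> Tuple[Set[str], Set[str]]:
        """Record the published state and return (added, removed) asset ids."""
        previous = self.refs.get(item_id, set())
        added = published_assets - previous      # references that are new
        removed = previous - published_assets    # references that disappeared
        if published_assets:
            self.refs[item_id] = set(published_assets)
        else:
            # Unpublished item or item without assets: forget it entirely.
            self.refs.pop(item_id, None)
        return added, removed
```

An unpublish is then just a publish with an empty set of assets: every previously stored reference ends up in `removed`, without the service needing to know anything about item versions or the Sitecore databases.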
And the third component is an Azure Service Bus queue that decouples the two end systems, Content Hub and Sitecore XP, removing any live dependencies and thus protecting my data integrity: whenever a publish event is fired from the Sitecore CMS, I can queue this event in Azure and let my microservice perform the required follow-up actions. Only when the API calls to Content Hub have succeeded is the event removed from the queue.
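The "only remove the event once Content Hub has been updated" rule can be sketched as follows. This is a simplified in-memory model in Python, not Azure Service Bus code: the function `drain`, the `max_attempts` parameter and the dead-letter list are assumptions for illustration, loosely mirroring the peek-lock style of receiving that Service Bus offers.

```python
from collections import deque
from typing import Callable, Deque, Dict, List


def drain(
    queue: Deque[Dict],
    update_content_hub: Callable[[Dict], None],
    max_attempts: int = 3,
) -> List[Dict]:
    """Process events in order; an event leaves the queue only after success.

    Events that keep failing are moved to a dead-letter list so that they
    are not lost and do not block the rest of the queue forever.
    """
    dead_letter: List[Dict] = []
    attempts = 0
    while queue:
        event = queue[0]  # peek: the event stays on the queue for now
        try:
            update_content_hub(event)
        except Exception:
            attempts += 1
            if attempts >= max_attempts:
                dead_letter.append(queue.popleft())
                attempts = 0
            continue  # retry the same event (or move on after dead-lettering)
        queue.popleft()  # success: now it is safe to remove the event
        attempts = 0
    return dead_letter
```

The point of the design is that a Content Hub outage only delays updates: the publish event remains queued until the call succeeds, so the usage overview cannot silently drift out of sync.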
Sitecore integrations
Of course, next to the middleware that makes the solution resilient and scalable, I also need to add functionality to both Sitecore XP and Sitecore Content Hub. Sitecore XP needs an extension that responds to item publish events, extracts the URLs of the referenced Content Hub assets from those items and puts them on the queue. And within Content Hub, I need to extend the schema of M.Asset to store the asset usage natively, add an External Page Component to display the usage, create an action that responds to asset delete actions, and add a UI component that informs the user (content editor) whenever they want to delete an asset that is currently in use.
And then there’s one more thing: when you start using this solution to track usage across your existing Sitecore solutions, you need to do an initial cross-reference check to set up the intermediate data store and populate all Content Hub assets with their current usage. So we need some kind of import tool that does a one-time scan to populate all resources required to make this tool functional, before we can start responding to new publishing events coming from the Sitecore CMS.
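As an illustration of the extraction step on the Sitecore side, the sketch below scans the raw field values of a published item for URLs that point to the Content Hub host. It is written in Python purely to show the idea (a real Sitecore XP extension would be .NET code), and the host name `hub.example.com`, the URL paths and the function name `extract_asset_urls` are assumptions for this example.

```python
import html
import re
from typing import Dict, List
from urllib.parse import urlparse

# Rough URL matcher: stops at whitespace, quotes and angle brackets.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def extract_asset_urls(fields: Dict[str, str], content_hub_host: str) -> List[str]:
    """Return the unique Content Hub URLs found in an item's raw field values."""
    found = set()
    for raw_value in fields.values():
        # Field values may contain HTML or XML, so decode entities like &amp; first.
        value = html.unescape(raw_value)
        for url in URL_PATTERN.findall(value):
            if urlparse(url).netloc.lower() == content_hub_host.lower():
                found.add(url)
    return sorted(found)
```

Filtering on the host keeps links to other websites out of the queue, so only references that matter for asset usage are reported.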
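Conceptually, the one-time import boils down to building the item-to-asset mapping for the data store and inverting it into a per-asset usage list for Content Hub. A minimal Python sketch, assuming the scan yields pairs of an item id and the asset ids it references (the function name and the input shape are hypothetical):

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def build_initial_usage(
    scan: Iterable[Tuple[str, Iterable[str]]],
) -> Tuple[Dict[str, Set[str]], Dict[str, Set[str]]]:
    """Turn a one-time scan into both lookup directions.

    Returns (item_to_assets, asset_to_items): the first seeds the
    intermediate data store, the second is the usage to write to each asset.
    """
    item_to_assets: Dict[str, Set[str]] = {}
    asset_to_items: Dict[str, Set[str]] = defaultdict(set)
    for item_id, asset_ids in scan:
        assets = set(asset_ids)
        if not assets:
            continue  # items without asset references are not stored
        item_to_assets[item_id] = assets
        for asset_id in assets:
            asset_to_items[asset_id].add(item_id)
    return item_to_assets, dict(asset_to_items)
```

Once both mappings are in place, every later publish event only has to apply a small delta on top of this baseline.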
The execution
I’ve had this idea for a while, and it has been on my wish list for quite some time. But as you can read above, it’s quite an elaborate setup, so I kept postponing the work, as these open-source projects are spare-time efforts for me personally.
At iO, we continuously invest in our collaboration with colleges and universities via collective projects, guest lectures and internships, and I was lucky to find two highly skilled Software Engineering students for an internship who told me they were up for a challenge. I offered them the above as an internship project and they absolutely went above and beyond, turning this concept into a fully working and well-documented extension for both Sitecore solutions. Since we are going to use this ourselves for one of our customers that’s currently on Sitecore XP 10.4, the solution is fully focused on Sitecore XP (and would also work for Sitecore XM). A future version of the tool will also be made available for the SitecoreAI CMS (f.k.a. XM Cloud).
In our next collective blog post, you can read how Pasha and Darlon transformed my concept into a practical and super useful extension to both Sitecore XP and Sitecore Content Hub.
