The Staging slot EventQueue issue

Say what?

Having the possibility of using staging slots when moving from on premises to the Cloud was one of the most promising and convenient improvements that could enhance the stability and uptime of our applications. Very exciting! But soon we ran into an issue, both with Cloud Services and with Web Apps:

Where we could replicate all Sitecore server roles or instances, the databases were still singular in their setup. This caused issues between the two active Content Management instances, because they shared the same EventQueue (which lives in the database). And you cannot easily replicate these databases either, because while the new version runs in the deployment slot, content editors can still add new content in the current production slot, which would be lost upon swapping both slots. A content freeze is not what we want, and difficult synchronization processes between the two databases while staging are not desirable, so we need to come up with a solution for the single database setup.

Please note that we run in staging mode for quite some time during the deployment process, for regression testing purposes. If you swap right away after a deployment, you may not experience or notice this particular issue.

Downsides of the staging slot

So what issues can we encounter when using the staging slot?

  • Updating or adding templates and layouts in the deployment slot also causes the production slot content to change. This is something you have to be aware of, and something I am not going to cover in this blog post (maybe later!). However, this doesn’t cause issues in most scenarios, as your code is loosely coupled with your templates (missing fields should not cause exceptions and added fields are simply ignored). If you know you are going to introduce breaking changes in a release, slot swapping isn’t going to be your friend, and you might need a maintenance window for that (or a content freeze and two sets of SQL databases).
  • You cannot do destructive tests in your deployment slot, nor can you add test content, because this would pollute your live database. Please be aware of this too.
  • Using the deployment slot, you will create two Content Management instances in your entire Sitecore installation, which gets you into trouble when publishing new content. And you have to publish, because your new templates need to be published (in my case preferably using the auto publish feature of Unicorn after synchronizing my new content setup). Concurrent CM instances both using the same EventQueue are troublesome, and that’s exactly what we need to fix to get the staging slot mechanism stable and reliable.

The good news is that your content editors can keep working on their content without risking losing anything.

Multiple CM instances

In a way, running a second group of Web App instances for all the different Sitecore server roles in the staging slot, while sticking to the same set of SQL databases, looks a lot like having multiple Content Management servers as described in the horizontal scaling scenario of Sitecore’s scalability options. So that’s the route I am going to follow for this blog post to solve the event queue issue.

If your CM environment contains two or more instances, you need to choose and configure one CM instance that will act as a publishing instance. This instance can still be used for authoring, and publish operations can still be initiated from any CM instance, but only the publishing instance will be responsible for executing the publishing task itself. Publish operations from all CM instances are queued, and the publishing instance picks them up as a sequence.

The instance name

So, first, Sitecore needs to be able to distinguish the multiple Content Management servers. On premises, and in earlier Sitecore versions, the instance name was empty by default, and its effective value was built up by concatenating the NetBIOS machine name and the name of the IIS Web site, separated by a hyphen character. This ensures it is always a unique name. But with Sitecore 8.2 Update 1, when provisioning using the default SCWDP packages, I saw the values are set by default within the scalability configuration file. I assume the previous technique isn’t valid anymore when running on Web Apps.

Mind that the instance name can be set in App_Config/Sitecore.config via the InstanceName setting.
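For reference, this is roughly what that setting looks like. It is a sketch of the relevant fragment only: the surrounding settings are omitted, and the exact root attributes of your Sitecore.config may differ per version.

```xml
<!-- App_Config/Sitecore.config (fragment; all other settings omitted) -->
<sitecore>
  <settings>
    <!-- Left empty here, so the effective value comes from a patch file -->
    <setting name="InstanceName" value="" />
  </settings>
</sitecore>
```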

It can also be set in App_Config/Include/ScalabilitySettings.config, via a patch attribute on the InstanceName setting.
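As a sketch, such a patch would look something like the fragment below. The “sc8CM1” value is the default I found after provisioning; the rest of the file is omitted here.

```xml
<!-- App_Config/Include/ScalabilitySettings.config (fragment) -->
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <settings>
      <!-- Overrides the value attribute of the InstanceName setting -->
      <setting name="InstanceName">
        <patch:attribute name="value">sc8CM1</patch:attribute>
      </setting>
    </settings>
  </sitecore>
</configuration>
```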

The latter overrides the former. For clarity, and to group all scalability related configurations in one file, I would recommend only populating or changing the value in the ScalabilitySettings config and leaving the Sitecore.config value empty for this setting.

With the default SCWDP packages, the name is set to “sc8CM1”. When having only one CM instance in the production slot, that’s just fine. But when deploying to the deployment slot, we don’t want to have two instances with the same name. We need to change this value as long as our new version resides in the deployment slot, but the moment we swap slots, it has to be reset to “sc8CM1” again. And that is our challenge.

The publishing instance

The publishing instance is also set both in the Sitecore.config and in the ScalabilitySettings.config, just like the instance name. And again, we will only focus on the patch attribute in the scalability configuration file. Let’s put our instance name, “sc8CM1”, in the patch attribute node of the Publishing.PublishingInstance setting.
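A minimal sketch of that patch, assuming the same file layout as for the instance name (only the relevant node is shown):

```xml
<!-- App_Config/Include/ScalabilitySettings.config (fragment) -->
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <settings>
      <!-- Points every CM instance at the CM in the production slot -->
      <setting name="Publishing.PublishingInstance">
        <patch:attribute name="value">sc8CM1</patch:attribute>
      </setting>
    </settings>
  </sitecore>
</configuration>
```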

Whatever the situation will be (currently only one live CM instance or currently running a second one in the deployment slot) you always want the CM in the production slot to be the publishing instance. So we only have to set this once (it is empty by default) and leave it like that ‘forever’. No tricks needed around deployments. If there is only a sc8CM1 instance, this parameter will point to itself when also populated with the “sc8CM1” value. And when there’s a CM instance active in the deployment slot, both instances will point to the CM instance of the production slot. So the value can be left like that for all configurations regarding the content management role.

The trick is in changing the instance name of the CM instance in the deployment slot, to prevent it from performing publishing operations.

Switching instance names

There are three reasons you do not want to change the instance name of the Sitecore CM instance running in the production slot: a) it doesn’t aid in differentiating the two CM instances, b) changing it (for example from “sc8CM1” to “sc8CM1-prod”) causes an application pool recycle, which harms your zero downtime deployment strategy, and c) it would require you to change the publishing instance value on both slots as well, which is unnecessarily complicated. So we are going to keep that setting unchanged in the production slot. The publishing instance can be left untouched as well, as explained in the previous paragraph. So all this boils down to one task: switching the instance name of the CM instance in the deployment slot, while it is in staging.

In other words: how could we temporarily change the instance name, preferably without altering files that are deployed (for continuity reasons and following good CI principles of using environment agnostic artifacts as much as possible)? That’s actually quite easy with Sitecore: we are going to add a temporary config file patching the instance name value, and delete this extra file just prior to initiating the swap!

Because I like to keep all config files vanilla in my Visual Studio solution (for easy upgrading and separating the default from the custom values), I am going to add the ScalabilitySettings.config file to my solution, translating the patch attributes using SlowCheetah, and adding a second config that re-patches the desired attribute during staging, only aiming at the CM role. We will name this new file “zzScalabilitySettings.config” to make sure it is processed after the original one.

But first, let’s transform the original patched attributes in the ScalabilitySettings.config for my CM instance publishing target in Azure, setting both the instance name and the publishing instance to “sc8CM1”.
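Such a transformation could be sketched as follows. This is an assumption-laden example rather than my exact file: the transform file name is hypothetical, and it assumes both setting nodes are present (not commented out) in the vanilla ScalabilitySettings.config, so that the Match(name) locator can find them.

```xml
<!-- ScalabilitySettings.CM.config (hypothetical transform file name) -->
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/"
               xmlns:xdt="http://schemas.microsoft.com/XML-Document-Transform">
  <sitecore>
    <settings>
      <!-- Replace the whole setting node that has the same name attribute -->
      <setting name="InstanceName" xdt:Transform="Replace" xdt:Locator="Match(name)">
        <patch:attribute name="value">sc8CM1</patch:attribute>
      </setting>
      <setting name="Publishing.PublishingInstance" xdt:Transform="Replace" xdt:Locator="Match(name)">
        <patch:attribute name="value">sc8CM1</patch:attribute>
      </setting>
    </settings>
  </sitecore>
</configuration>
```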

This will name all CM instances “sc8CM1” and the publishing instance will always be “sc8CM1” too. Then, let’s add the new “zzScalabilitySettings.config” file, which re-patches the instance name to a different value.
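A minimal sketch of what this temporary file could contain. The “sc8CM1-staging” value is a hypothetical name picked for illustration; any value that differs from “sc8CM1” serves the purpose.

```xml
<!-- App_Config/Include/zzScalabilitySettings.config (CM role only) -->
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <settings>
      <!-- Re-patches the instance name while this CM sits in the deployment slot.
           "sc8CM1-staging" is a hypothetical value. -->
      <setting name="InstanceName">
        <patch:attribute name="value">sc8CM1-staging</patch:attribute>
      </setting>
    </settings>
  </sitecore>
</configuration>
```

Note that the publishing instance is deliberately not touched here: it keeps pointing at “sc8CM1”, the CM instance in the production slot.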

Be sure to only put this file in the WDP for your CM instance, or to only add it in the transformation targeted at this server. The fun part is that this is the initial situation of every new deployment from now on, so this file can always be present. Removing it simply reverts the instance name to “sc8CM1”, so the instance becomes its own publishing instance again.

Where the magic happens

To finish up our neat little trick of deploying a CM instance into the staging slot that uses the CM instance in the production slot as a publishing instance, we need to remove the temporary config file as described above, right before initiating the swap.

The -verb:delete and -dest:contentPath parameters of the Web Deploy (msdeploy) command are important to perform the delete task for this specific content file; the rest of the command should be populated with your own access and authentication details.

This should solve any (concurrency) issues regarding the multiple CM instances you have during your deployment process. A simple fix for a complex issue, I think. Please let me know if you encounter any other issues during the deployment and swapping process, or if this mechanism improved the stability of your Sitecore environment. It would be great to learn from each other’s experiences on this relatively new subject.

Comments

6 Comments

Anton Kuryan

But this means that you will swap a cold instance into production, which is in a restarting state. So it does not seem to fit into a zero downtime scenario.

Rob Habraken

Good question Anton! My advice on configuring a publishing instance during deployment doesn’t dictate swapping a cold instance. You could warm up the instance in your deployment slot before swapping, right after removing the zzScalabilitySettings.config file, as long as you keep the publishing instance configured during the installation of the new content and during the actual publishing of this new content. And don’t publish anymore while warming up your new instance.

António Ribeiro

Great article.

Just for clarification: in Azure Web Apps with multiple CDs and multiple CMs, what are the recommended configuration settings for each role?

CD App_Config/Sitecore.config: ???

CD App_Config/Include/ScalabilitySettings.config: ???

CM App_Config/Sitecore.config: ???

CM App_Config/Include/ScalabilitySettings.config: ???

Thank you!

Rob Habraken

Thanks Antonio! The specific CM configurations regarding the Sitecore.config and ScalabilitySettings.config files are already described extensively in the article; I couldn’t clarify this any further. And no other changes should be made for this purpose than those mentioned above. The CD configurations for scaling out shouldn’t be altered; the EventQueue isn’t populated from the CD roles, so for those instances, running a staging slot instance simultaneously isn’t an issue. By the way, if you agree upon a content or publishing freeze during deployments, you wouldn’t need any changes to the configurations. This article only applies if you don’t want to have a content or publishing stop while deploying using a staging slot.

Patrick McNamara

If you delete the file zzScalabilitySettings.config from the slot wouldn’t the file need to be added to production (or vice versa) so that zzScalabilitySettings.config only exists on one of the CMs at a time?

Rob Habraken

Good point Patrick. Actually, the patch file is there during the use of both the production and staging slot. After deleting the patch file and swapping, you should either stop the staging slot and not use it, or you could delete the patch file _after_ swapping to keep them different, but that would cold start production. Either way, it is not ideal in my opinion, but I have not yet heard of a better alternative. You could alter Sitecore in such a way that you could use a slot setting to control this though, but that’s not possible out of the box.


Rob Habraken

Software Engineer, Technical Manager, Consultant, Sitecore MVP and overall technology addict. Specialist in web development, Microsoft technology and, of course, Sitecore.

https://www.robhabraken.nl