This is part two of my blog series on Data Privacy, accompanying my Sitecore modules developed for automatic disposal of personal data in Sitecore.
Read part one if you’re new to this topic.
An overview of Sitecore databases
This is an overview of all non-content databases of a typical Sitecore installation. Note that Cortex and the Universal Tracker store personal data as well, but only temporarily. This overview also assumes you did not change Sitecore’s default behavior.
The processing, reference and reporting databases all contain references or aggregated data only, not the actual (personal) data. The marketing automation plan database contains the plans and their configuration, but not the data that goes through the plans. So we don’t have to worry about those. The other three, however, do store personal data: the xDB Collection database (Experience Profiles), the EXM database (email addresses) and the Forms database (form submissions). The first two are fully covered by the ExecuteRightToBeForgotten() (or ‘Anonymize’) feature. But not the Forms database!
Storage is optional
I should add that storing form submissions is optional. If you only handle the submitted data externally (e.g. via mail or sending it to an external API) you obviously do not have to clean up anything. But if you do store form submissions using the ‘Save Data’ submit action, you need to take care of the disposal of any personal data within the form submissions yourself.
Since Sitecore 9.3, you have the option to delete form data from the dashboard in Sitecore Forms. This feature, however, only lets you delete either the entire list of submissions of a certain form, or the entries within a specific date range.
This makes it impossible to delete specific form entries, whether to enforce a specific retention period or to clean up the data of a specific Contact in xDB. And there’s another issue: there is no link between a form submission and a Contact in xDB!
The Sitecore documentation tells you that if a form submission is linked to an identifier such as a contact identifier or an email address, you can use SQL to clear or delete rows. It also shows the reporting database storing metrics about form submissions, linking a Contact ID to a Form Entry.
That’s perfect! If only it were true…
When diving into the databases myself, I found that the reporting database actually stores the FormID, not the FormEntryID. This means it doesn’t tell you which entry belongs to a certain contact, only which form they used to submit data. That doesn’t help when you want to execute the right to be forgotten on a contact and include any form submissions that visitor made.
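To make the problem concrete, here is a minimal sketch in Python using an in-memory SQLite database. The table and column names (`form_entries`, `reporting_facts`) are simplified stand-ins made up for illustration, not Sitecore’s real schema. It shows that when the form ID is the only shared key, a lookup for one contact returns every entry of that form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified stand-ins for the Forms and reporting tables.
    CREATE TABLE form_entries (entry_id TEXT PRIMARY KEY, form_id TEXT, email TEXT);
    CREATE TABLE reporting_facts (contact_id TEXT, form_id TEXT);

    INSERT INTO form_entries VALUES
        ('entry-1', 'newsletter-form', 'alice@example.com'),
        ('entry-2', 'newsletter-form', 'bob@example.com');
    -- The reporting side only records that contact A used the newsletter form.
    INSERT INTO reporting_facts VALUES ('contact-a', 'newsletter-form');
""")

# The only available join key is the form ID, so the query cannot
# tell contact A's entry apart from anyone else's entry on that form.
rows = conn.execute(
    """
    SELECT e.entry_id
    FROM reporting_facts r
    JOIN form_entries e ON e.form_id = r.form_id
    WHERE r.contact_id = ?
    ORDER BY e.entry_id
    """,
    ("contact-a",),
).fetchall()

matched_entries = [row[0] for row in rows]
print(matched_entries)  # ['entry-1', 'entry-2']
```

Both entries come back, including the one that belongs to someone else, so deleting on this basis would either remove too much or require guessing.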
Let’s get this fixed!
- We need to establish a link between an xDB Contact and a Form Entry
- We need to remove the data upon executing the Right to be Forgotten
After I reported what I think is a documentation error, I started figuring out how to fix this myself, using an elegant and universal approach. First of all, you cannot access Form data via the Sitecore API like you would when interacting with Sitecore content items from code. So we do need SQL to solve the puzzle. Secondly, while extending the ExperienceForms SqlFormDataProvider class sounds like a decent plan, it would actually require you to add a whole bunch of assemblies to your xConnect deployment (including the Sitecore kernel). So I resorted to a simple custom class with some plain SQL in it.
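The idea can be sketched as follows, in Python with an in-memory SQLite database rather than the module’s actual code. The schema is a simplified assumption (an entry table plus a field data table), and the `ContactId` field name and the `save_entry` and `delete_form_entries` helpers are hypothetical names chosen for illustration; see the GitHub repository for the real implementation. The save step stores the contact ID as one extra field on each entry, and the cleanup step deletes every entry that carries that contact ID.

```python
import sqlite3

CONTACT_FIELD = "ContactId"  # hypothetical name of the extra field holding the link


def create_schema(conn):
    # Simplified assumption of a Forms-like schema, not the real one.
    conn.executescript("""
        CREATE TABLE FormEntry (ID TEXT PRIMARY KEY, FormItemID TEXT);
        CREATE TABLE FieldData (FormEntryID TEXT, FieldName TEXT, Value TEXT);
    """)


def save_entry(conn, entry_id, form_id, fields, contact_id):
    """Store a submission and add the contact ID as one more field."""
    conn.execute("INSERT INTO FormEntry VALUES (?, ?)", (entry_id, form_id))
    all_fields = dict(fields, **{CONTACT_FIELD: contact_id})
    conn.executemany(
        "INSERT INTO FieldData VALUES (?, ?, ?)",
        [(entry_id, name, value) for name, value in all_fields.items()],
    )


def delete_form_entries(conn, contact_id):
    """Delete all entries, and their field data, linked to the contact."""
    entry_ids = [row[0] for row in conn.execute(
        "SELECT FormEntryID FROM FieldData WHERE FieldName = ? AND Value = ?",
        (CONTACT_FIELD, contact_id),
    )]
    for entry_id in entry_ids:
        conn.execute("DELETE FROM FieldData WHERE FormEntryID = ?", (entry_id,))
        conn.execute("DELETE FROM FormEntry WHERE ID = ?", (entry_id,))
    conn.commit()
    return len(entry_ids)


conn = sqlite3.connect(":memory:")
create_schema(conn)
save_entry(conn, "e1", "contact-form", {"Email": "alice@example.com"}, "contact-a")
save_entry(conn, "e2", "contact-form", {"Email": "bob@example.com"}, "contact-b")

deleted = delete_form_entries(conn, "contact-a")
remaining = [row[0] for row in conn.execute("SELECT ID FROM FormEntry")]
print(deleted, remaining)  # 1 ['e2']
```

Storing the link inside the entry itself keeps the cleanup independent of the reporting database, which is why only the Forms tables appear in the sketch.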
Again, you can download this implementation as a handy Sitecore module, which is tested on Sitecore 9.1 update 1 and above. It contains the custom activity type and the custom save action. You can also use my example implementation on the Habitat Corporate project as a starting point for your own custom implementation, as this is mainly an example: maybe you want to base the link on the user’s email address, or maybe you want to use another trigger to clean up this data. Either way, check out my GitHub repository at https://github.com/robhabraken/data-privacy to get started!
And then, only two days after publishing my Sitecore modules and demo code, I got a very clever Pull Request from fellow MVP Anton Tishchenko: an xConnect Service Plugin that uses my link between a Contact and a Form Entry to automatically call the DeleteFormEntries() method whenever the Right to be Forgotten is executed! After a few rounds of revisions back and forth, we got it to work flawlessly with my module, and now you do not even need the ‘EraseFormSubmissions’ action in your marketing plan anymore. More importantly, it ensures that your form data is cleaned up regardless of the method being used, fully automatically. It doesn’t matter whether this is done manually via the Experience Profile dashboard by a marketer, via custom code, or via the marketing automation plan we’ve created.
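Conceptually, the plugin hooks the cleanup into the one place that every trigger passes through. The sketch below is a toy Python model of that pattern only; the `ContactService` class and its methods are hypothetical and do not represent the xConnect plugin API.

```python
class ContactService:
    """Toy stand-in for the service that executes the right to be forgotten."""

    def __init__(self):
        self._handlers = []

    def on_right_to_be_forgotten(self, handler):
        # Plugins register a callback that receives the contact ID.
        self._handlers.append(handler)

    def execute_right_to_be_forgotten(self, contact_id):
        # The contact itself would be anonymized here; afterwards
        # every registered plugin gets a chance to clean up its own data.
        for handler in self._handlers:
            handler(contact_id)


# Toy forms storage: entry ID mapped to the contact ID stored with it.
form_entries = {"e1": "contact-a", "e2": "contact-b", "e3": "contact-a"}


def delete_form_entries(contact_id):
    """Remove every form entry linked to the given contact."""
    linked = [entry for entry, owner in form_entries.items() if owner == contact_id]
    for entry in linked:
        del form_entries[entry]


service = ContactService()
service.on_right_to_be_forgotten(delete_form_entries)

# Dashboard, custom code and automation plans would all end up in this call.
service.execute_right_to_be_forgotten("contact-a")
print(sorted(form_entries))  # ['e2']
```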
Install my EraseFormSubmissions module, and the default ExecuteRightToBeForgotten method will also cover your Forms data, 100% automatically!
So all in all, we have one less thing to worry about in cleaning up our visitors’ data. And it always feels good to automate things, even more so when it is done by extending the platform using its core features!
If you want to learn more about how to add custom fields to your form submissions, check out this blog post: https://visionsincode.wordpress.com/2018/12/09/add-hidden-data-to-sitecore-forms-without-hidden-fields/.
And for an overview of where personal data resides in Sitecore and how to trace it, read https://doc.sitecore.com/developers/93/platform-administration-and-architecture/en/data-subject-entities-in-your-sitecore-implementation.html.