The goal of this article is to educate customers and provide comprehensive information on data manipulation, including unintended mass deletes and updates.
ServiceNow / ITIL Best Practices for Data Manipulation
Customers tend to delete data from the instance when they feel it is no longer needed. They sometimes delete data directly in the Production instance without proper testing in a sub-production instance, and end up deleting crucial data that should not have been deleted. This leaves the Production instance in a state of data loss, sometimes even making the instance unusable for end users because the affected tables are empty.
Deleting data from the instance should be considered a last resort. If the customer has no option but to delete the data, they should follow the ITIL-recommended best practices:
- Perform a full clone of PROD over a sub production instance.
- Delete the data.
- Do a check on all the related tables and find out if the action has only deleted the intended data or has deleted other crucial data as well.
- If everything looks OK, then follow the same process in your Production instance.
Customers should keep in mind the cascade delete rules that come into play when data is deleted. Depending on how the cascade delete rules are configured, related data may or may not be deleted along with the target records. If you are unaware of the cascade rules, please go through the documentation linked below.
If you are manipulating data through a script, always do a test run first to confirm how many records will be affected. You can use the getRowCount() method for this.
Reasons For Unwanted Deletes/Updates
1. Cascade delete rules: When a record is deleted, configurable cascade rules determine what happens to records that reference the deleted record. The customer needs to review the cascade delete rules before any data is deleted.
2. Bad query in the script: One of the most common ways customers manipulate data is through scripts. If the query in the script is incorrect, it may end up manipulating data that was never intended to be touched. To avoid this problem, customers should check the number of records the query will process before actually deleting any records.
An incorrectly constructed encoded query, such as one that includes an invalid field name or a spelling error, produces an invalid query. When the invalid query is run, the invalid part of the query condition is dropped and the results are based only on the valid part, which may return all records in the table. Using an insert(), update(), deleteRecord(), or deleteMultiple() method on the results of a bad query can result in data loss.
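Field names can also be validated defensively before the query is built. The sketch below uses the GlideRecord isValidField() method; the table, field name, and query value are illustrative only:

```javascript
var gr = new GlideRecord('problem');
var field = 'short_descriptoin'; // deliberately misspelled for this example
if (!gr.isValidField(field)) {
    // Abort before running a query whose invalid condition would be dropped
    gs.print('Invalid field name "' + field + '" - aborting');
} else {
    gr.addQuery(field, 'CONTAINS', 'network');
    gr.query();
    gs.print('Records matched: ' + gr.getRowCount());
}
```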
Note: If you are manipulating data using deleteMultiple() or updateMultiple(), make sure the call is not inside a loop, and take extra care to test the script.
var test = new GlideRecord('problem');
test.addEncodedQuery('short_descriptoin=network'); // misspelled; should be short_description
test.query();
while (test.next()) {
    test.state = 2;
    test.update();
}
In the above scenario, the script would wrongly update all the records in the problem table because of the spelling error in the 2nd line: the invalid query condition is dropped and every record is returned. This scenario could be avoided by doing a test run with the below script first:
var count = 0;
var test = new GlideRecord('problem');
test.addEncodedQuery('short_descriptoin=network'); // same query as the update script above
test.query();
count = test.getRowCount();
gs.print("Number of records : " + count);
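Once the dry run returns the expected count, the bulk operation itself can be guarded by the same check. This is only a sketch: the corrected encoded query and the expected count are example values that must come from your own verified dry run:

```javascript
var expected = 120; // the count confirmed by your dry run (example value)
var gr = new GlideRecord('problem');
gr.addEncodedQuery('short_description=network'); // corrected, tested query
gr.query();
if (gr.getRowCount() == expected) {
    gr.deleteMultiple(); // removes only the records matched by the query
} else {
    gs.print('Count mismatch (' + gr.getRowCount() + ' records) - aborting delete');
}
```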
Deleting data from the instance does not free up any memory or space. Below are a few alternatives the customer can consider instead of deleting data from ServiceNow.
Alternatives To Deleting The Data
Please consider alternatives to deleting the data. Below are a few techniques, but you may have your own methods:
- If there is an option to make the record inactive, this is the best approach that the customer can follow. Make the record inactive and have a proper filter condition to exclude the inactive records.
- Archive records that are no longer required. This is another good approach: the data is moved out of the actual table but is not deleted from the instance. Once data is deleted you cannot get it back, but if it is archived you can refer to the archive tables in the future if required.
- Append a special character prefix to the records that are no longer required and exclude them with a filter condition. For example, append "zz" as a prefix and add the condition "does not start with zz".
The main area impacted by keeping a large table is reporting or retrieving records from it, and this can be addressed with proper filter conditions.
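The first and third alternatives above can be combined in a single server-side script. This is only a sketch: the table name, encoded query, and name field are hypothetical, and the query should already have been verified with a dry run:

```javascript
var gr = new GlideRecord('u_legacy_record'); // hypothetical table
gr.addEncodedQuery('active=true^sys_created_on<2018-01-01'); // verified query (example)
gr.query();
gs.print('Records to retire: ' + gr.getRowCount());
while (gr.next()) {
    gr.active = false;            // soft-delete via the Active flag
    gr.name = 'zz' + gr.name;     // prefix so a "does not start with zz" filter excludes it
    gr.update();
}
```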
Unexpected Consequences of Delete Operation
1. Cascade deletions
When a record is deleted, configurable cascade rules determine what happens to records that reference the deleted record.
For example, when a Configuration Item (CI) record is deleted, all tasks referencing that CI are also deleted. To help users understand that a cascade delete is about to be executed, a Warning! message is displayed before the user is allowed to proceed with the delete; all tables from which data will be deleted are listed in this message. Administrators and developers should be able to recognize and understand the implications of a deletion that displays this message.
Developers should keep in mind that the deletion activity will be added to the update set and promoted to Production, where it will delete all records that match the deletion criteria.
Note: Documentation on Cascade Delete Rules can be found in the standard ServiceNow documentation at https://docs.servicenow.com/bundle/london-platform-administration/page/administer/field-administration/task/t_CascadeDeleteRules.html
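Before deleting records from a table, it can help to list which fields across the platform reference that table, since those are the places where cascade rules will act. The sketch below queries the sys_dictionary table from a background script; cmdb_ci is just an example target table:

```javascript
var dict = new GlideRecord('sys_dictionary');
dict.addQuery('internal_type', 'reference');  // reference-type fields only
dict.addQuery('reference', 'cmdb_ci');        // fields pointing at cmdb_ci (example)
dict.query();
while (dict.next()) {
    // Prints table.field pairs that reference the target table
    gs.print(dict.name + '.' + dict.element);
}
```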
Example 1: KB0715787 - Unexpected Consequences when Catalog Variables are Deleted
Example 2: Deleting user record(s) can delete related group memberships, roles, and records in other tables
i. Use the Active flag, across the platform, instead of any Delete options.
ii. Recognize the Warning! dialog and its meaning
iii. Understand Cascade Delete Rules
When a single record is deleted, hundreds or even thousands of other records may be updated to remove references to the deleted record. This is the most common issue we see across almost all data loss cases. The best policy is that data should never be deleted from the platform, whether through the UI or through a script; use the Active flag instead. Deletions can cause catastrophic data loss events.
Workflows are usually tied to task-type records, and unintended deletions can leave workflows in an incorrect state. There is no direct method to correct the workflows immediately; multiple tables must be restored to return the workflows to a correct state. In such scenarios, the support team needs to restore them manually from Source of Truth (SOT) instances, which makes recovering the workflows and approvals time-consuming.
SLAs (task_sla) are attached to task records, so unintended task deletions or state updates can leave SLAs in an incorrect state. There is no direct method to recover SLA records; the restore must be performed manually, which is a time-consuming process.
Recovering the lost data
If you want to restore / undelete a record, please follow the below documentation. It is very important to consider the references option as well when you undelete any record. Please reach out to Technical Support if you have any queries.
Starting from London, there is a new feature called Rollback and Delete Recovery which helps restore data that has been manipulated. It is a more advanced version of the Undelete option. The tool is continuously evolving, and we have been addressing any issues we find in it. If you can find a rollback context related to your unintended data manipulation transaction, you can roll it back. The tool is more advanced in New York and later versions, where you should be able to roll the transaction back directly in PROD.
Use this option with caution. The Delete Recovery option is always the quickest, safest, and best way to restore the data.
If you need suggestions or help, you can always create a case when data has been deleted and needs to be restored. Restoration involves intervention from a separate team, and it becomes a challenging task if proper details are not available or if the issue is not reported immediately. The first thing to do when you notice data loss is to contact Customer Support, as they are best placed to suggest the next steps before you attempt the restore yourself.
KB0748445 details how to track deletes and troubleshoot these issues.
Data Restoration Process for Production instances
In a situation where data loss occurs on the customer's production instance, we follow the below process:
We will provision two new temporary instances:
- One for a standard backup restore or 'Point in Time Restore' taken prior to the data loss, used to retrieve the missing/affected records. This will be referred to as the Source of Truth instance. It represents the state of the customer's instance before the data manipulation occurred. This will only be available if the case is brought to our attention within 3 days of the data loss event.
- One to test the solution to restore data. This will be referred to as the Test Bed instance. This instance will be a copy of the affected PROD instance post deletion event.
It takes some time for Technical Support to determine the exact time at which the data manipulation occurred; this timestamp is needed to request the Point in Time Restore. Once the time is determined and the temporary instances are requested, provisioning time depends on your instance size. If the instance is very large, provisioning can take a day or more.
We do not restore the data if the data loss occurred in the sub production instance. Please go through KB0813303 - Handling Data loss in Sub-Production Instances.
In conclusion: avoid putting your instance in a state of data loss, and do as much as you can to prevent data loss in the first place.