Notifications

92 views

Description

A good first step in investigating any issue where CI relationships are involved, such as Dependency Views (BSM Map), CI Relations formatter and Identification Engine errors, is to clean up the CI Relationship table.

It is possible for CI Relationship records [cmdb_rel_ci] to become invalid/corrupt in several ways:

  1. Parent, Child or Type references are Empty. There is no value in one or more of these fields.
  2. Parent or Child references  are invalid, in that they have a sys_id value entered, but no record exists for that sys_id in the CMDB
  3. Type reference value is invalid, and isn't in the CI Relationship Type table
  4. Duplicate records exist for the exact same Parent, Child and Type. 

This article aims to explain how to delete or correct these records, so that you can then discount them as a potential cause of your issue. I will also try to identify some of the reasons they may come about in the first place.

Release or Environment

Any.

Instructions

In the following examples, scripts can be used to do the cleanup. There is always a risk when running this kind of script that more gets deleted than plans. Please review the Rollback Context immediately after running these to confirm the scripts did update/delete only what you were expecting.

You will see relation_health_result  records cascade deleted when cmdb_rel_ci records are deleted. This is fine and to be expected.

1.  Parent, Child or Type references are Empty. There is no value in one or more of these fields.

You can identify these records from a List:

https://<instance name>.service-now.com/cmdb_rel_ci_list.do?sysparm_query=parentISEMPTY%5EORchildISEMPTY%5EORtypeISEMPTY

A relationship record must always have a value in all 3 fields to be valid and useful, and so any without are pointless and should be deleted.

var relCI = new GlideRecord('cmdb_rel_ci');
relCI.addEncodedQuery('parentISEMPTY^ORchildISEMPTY^ORtypeISEMPTY');
relCI.query();
relCI.deleteMultiple();

2.  Parent or Child references  are invalid, in that they have a sys_id value entered, but no record exists for that sys_id in the CMDB

You can identify these records from a List. This list works by finding fields with values, but that can't then be dot-walked into.  

https://<instance name>.service-now.com/cmdb_rel_ci_list.do?sysparm_query=parentISNOTEMPTY%5Eparent.sys_idISEMPTY%5ENQchildISNOTEMPTY%5Echild.sys_idISEMPTY

A relationship record with no valid CI either end is also pointless, and should be deleted.

Note: If the "Data Archiving" plugin is installed, then this list may also bring up references for CIs that have been Archived since the relationship was created. The correct resolution when either the Parent or Child CI is Archived is to also archive the relationship record at the same time. There is a problem involved when attempting to implement archive rules in this case though, as a rule can only take one reference field into account, and these records reference both a parent and child CI. For more information, see:
You may instead decide that relationship records are no longer required once a parent or child CI is archived, and so these could also be deleted.

The following script would delete these records:

var relCI = new GlideRecord('cmdb_rel_ci');
relCI.addEncodedQuery('parentISNOTEMPTY^parent.sys_idISEMPTY^NQchildISNOTEMPTY^child.sys_idISEMPTY');
relCI.query();
relCI.deleteMultiple();

3.  Type reference value is invalid, and isn't in the CI Relationship Type table

You can identify these records from a List:

https://<instance name>.service-now.com/cmdb_rel_ci_list.do?sysparm_query=typeISNOTEMPTY%5Etype.sys_idISEMPTY

I've kept these 'Type' broken references as a separate section, as correction of the value is usually the correct approach to these.

Some records may have a Type value of "1", because all instances originally provisioned before the London release have this field value defaulted to "1" if no type was selected/entered when the record was created. There used to be code that interpreted that as a 'Depends on::Used by' relationship type, but that code is no longer there, making this an invalid type. In this situation the "1" value should be replaced by "1a9cb166f1571100a92eb60da2bce5c5", which is the sys_id of the actual 'Depends on::Used by' relationship type  record.

Some records may have a truncated sys_id value for the type. Until the London release it was possible to cause this by clicking Save and Exit multiple time on the relationship editor. It is usually possible to figure out which type it was supposed to be from what remains of the value.

If a full sys_id is present but is missing from the relationship type table, especially if the relationship record was created by out-of-box code, then it may be that out-of-box relationship type records have been deleted in the instance, and those will need restoring. 

If the sys_id is not one of the out-of-box relationship types, then the cause could be instances that are not in sync with each other. CI relations records imported as XML from a different instance could have this issue, if the type doesn't exist on the target instance.

The following script will correct both the "1" values and truncated values in the Type field:

var relCI = new GlideRecord('cmdb_rel_ci');
relCI.addEncodedQuery('typeISNOTEMPTY^type.sys_idISEMPTY');
relCI.query();
while (relCI.next()) {
if (relCI.type.toString() == '1') {
relCI.type = '1a9cb166f1571100a92eb60da2bce5c5';
gs.log('Correcting cmdb_ci_rel sys_id:'+ relCI.sys_id + ', Replaced type=1 with type=1a9cb166f1571100a92eb60da2bce5c5','KB0780988');
relCI.setWorkflow(false);
relCI.update();
}
else if (relCI.type.toString().length < 32) {
var typeGr = new GlideRecord('cmdb_rel_type');
typeGr.addEncodedQuery('sys_idENDSWITH' + relCI.type.toString());
typeGr.query();
if (typeGr.next() && typeGr.getRowCount() == 1) {
 gs.log('Correcting cmdb_ci_rel sys_id:'+ relCI.sys_id + ', Replaced type='+ relCI.type +' with type=' + typeGr.sys_id ,'KB0780988');
relCI.type = typeGr.sys_id;
relCI.setWorkflow(false);
relCI.update();
}
else {
gs.log('Unable to correct cmdb_ci_rel sys_id:'+ relCI.sys_id + ', type='+ relCI.type + ' - No matching record for truncated type reference','KB0780988');
}
}
else {
gs.log('Unable to correct cmdb_ci_rel sys_id:'+ relCI.sys_id + ', type='+ relCI.type,'KB0780988');
}
}

4.  Duplicate records exist for the exact same Parent, Child and Type. 

This is not possible to identify from a list view. A script is required to identify and delete the additional records, leaving the original. The following script can be run in dry-run mode to check.

The following script is based on an internal KB Article (KB0744490) that makes use of GlideAggregate, and setup for this situation. This will do batches on 5000 at a time in order to avoid putting too much pressure on the instance memory, and so may need running multiple times.

Where CIs either side of a relationship being deleted are involved in a Service, Service Mapping Re-computation jobs will be expected to run to make sure the service model Layers/Environment remains in sync with the relationships and CIs. This is normal.

// Background script for Deleting duplicate records
// *********************************** SETUP ****************************************************************************************************************
var dryRun = true; // SET THIS TO false TO ACTUALLY DELETE RECORDS
var tableName = 'cmdb_rel_ci'; // TARGET TABLE
var encodedQuery = 'parentISNOTEMPTY^typeISNOTEMPTY^childISNOTEMPTY'; // APPLY ANY FILTER CONDITIONS HERE
var identifierFields = 'parent,type,child'; //USED TO IDENTIFY THE DUPLICATES
var batch = 5000 ; //SETS THE LIMIT FOR HOW MANY RECORDS ARE IN THE GLIDERECORD

// **********************************************************************************************************************************************************
var identifiers = identifierFields.split(','); //split up the groupby fields, we'll use this one later

var agg = new GlideAggregate(tableName);
agg.addQuery(encodedQuery);
agg.setLimit(batch);

// ADD THE COMBINATION OF UNIQUE IDENTIFIERS HERE
for(var i = 0; i < identifiers.length; i++) {
agg.groupBy(identifiers[i]);
}
// DON'T BOTHER READING ANY RECORDS THAT AREN'T DUPLICATE
agg.addAggregate('COUNT');
agg.addHaving('COUNT','>','1'); // only where 1 or more duplicates exist
agg.query();

// AT THIS POINT WE ONLY HAVE DUPLICATE RECORDSETS (WHERE THERE IS MORE THAN ONE RECORD WITH THE SAME IDENTIFYING FIELDS)
// NOW LOOP THROUGH ALL OF THOSE RECORDSETS, KEEPING ONLY THE OLDEST RECORD AND DELETING THE REST
var resultString = '\nUnique Set\tRecord to keep\tRecord(s) to delete\n'; // In order to avoid using many gs.print() statements we simply add all the results to a string and display them at the end.
var resultCount = 0 //Count the duplicates

while(agg.next()) {
resultString += getSetDetailsString(agg); //log the set that brought us here

var dup = new GlideRecord(tableName);
dup.addQuery(encodedQuery);
dup.setLimit(batch);

//get the records that match the set
for(var i = 0; i < identifiers.length; i++) {
dup.addQuery(identifiers[i],agg[identifiers[i]]);
}

// order by date ascending. the first record is the oldest and the one we want to preserve. Change for 'orderByDesc' to preserve the newest record and remove the rest
dup.orderBy('sys_created_on');
dup.query();

// skip the first record (original) and log it
dup.next();
resultString += (dup.sys_created_on + ' - '+ dup.sys_id + '\t');

// loop through and delete the remaining records for this set. Only deletes if dryRun = false
while(dup.next()) {
resultString += (dup.sys_created_on + ' - '+ dup.sys_id + ', ');
resultCount++;

if(!dryRun) { //CAREFUL
dup.setWorkflow(false);
dup.deleteRecord();
}
}

resultString += '\n';
}

//Display results in TAB SEPARATED FORMAT
gs.print('Duplicate count: ' + resultCount);
gs.print(resultString);

//helper method to dynamically create the SET string for logging
function getSetDetailsString(gRecord) {
var returnString = '(';
for(var i = 0; i < identifiers.length; i++) {
returnString += identifiers[i] + ': ' + gRecord[identifiers[i]] + ', ';
}
returnString += ')\t';
return returnString;
}

 

Additional Information

These are some of the known causes:

  • Until the London release, a new relationship record saved without a Type value would default to the value "1". This was the legacy sys_id for 'Depends on::Used by' relationship type, but the field in the dictionary retained this default value long after. This was fixed by PRB1251706 for new instances only. For customers upgrading, the default value needs clearing from the dictionary for this field manually.
    KB0743762: The CI Relationship record's Relationship Type reference field has a bad default value of 1
  • Until London (+KP8), PRB1273072 meant the Relationship Editor did not prevent the "Save and Close" button being clicked multiple times. This led to the strange effect of the sys_id value of the relationship type field being truncated. To prevent it make sure you click once and then wait for the form to close.
  • Until London (+KP5/JP9) we had a PRB1241891 where relationships to Application CIs were created by Application Patterns before ADM had created the Application CIs. That would leave a paretne or child value empty.
  • If Relationship Type records are manually deleted, then Cascade Delete will also delete the Relationship records, so that should not cause the issue. However if CIs and their relationships are Inserted/Imported via custom scripts or XML Import/Import Sets, and the Relationship Type does not exist, then these bad relationship records can be created that way.
  • If Data Archiving is used for the CMDB, then a CI could be archived and the Relationship records still remain. The archive rule needs to be set up with related records in the cmdb_rel_ci table also being archived at the time, but this is impossible to do both the Parent and Child reference fields at the same time due to PRB1296280, which will aim to remove the single 'reference element' restriction.
    KB0695364: It is not possible to archive all cmdb_rel_ci record when CI records are Archived, leading to CompactRelation errors when the CMDB Identification Engine fails to consider that a CI may be Archived and throws an error, causing major syslog table growth
  • Between loading the Relationship Editor, and clicking the "Save and Close" button, it is possible that the CI has been deleted by someone else for either the main CI or the related CI. A relationship record is added anyway, using the earlier sys_id values, without checking that the CIs still exist. Also, if the relationship editor form is loaded with an invalid or empty sys_id in the URL, then it would still let you add relationships to a non-existent CI, and cause bad records. Unlikely, but possible.

Article Information

Last Updated:2019-10-07 07:00:22
Published:2019-10-07