812 views

Description

This performance issue was identified during an upgrade, and is due to the length of time it can take to add a new column with a default value (that is NOT prefixed with "javascript:".) to large tables (100K+). In addition to the time spent creating the new column, a re-run is necessary through all of the data again to apply the default value.

 

Steps to Reproduce

  1. Pre-Kingston instance
  2. Get a large table containing 100k rows or more.
  3. Add a column to the table with a default value:
    • It's important that the default value is specified when the column is added, not after.

Based on the size of the table this operation can take an excessive time to add the column and then go back through and apply the default value.  In some cases, more than a day.  When this operation is during an upgrade it can cause the upgrade to take a very long time and  give the appearance of not making progress.

Criteria for determining pre-Kingston instances at risk:

  • Planning to upgrade from Helsinki, Istanbul, Jakarta
  • The em_alert_history table contains 1M records or more
  • The following columns do not exist and are added on upgrade: correlation_group, correlation_rule_group, group, group_source, impact_services, is_group_alert.

Workaround

 

Note: The following steps should be performed first on a sub-prod instance before doing so in production. This issue can affect all large tables on upgrade if a new column is added, however, the above table/columns have been identified by ServiceNow as affected by this problem and leading to potential risk of a long upgrade.

  1. Download the following attachments: sys_auto_flush_62c63f08930002000fb3b9ab357ffb4a.xml, sys_remote_update_set_f6d841a6db5bb200b9d5b701ef96198b.xml
  2. Clean up un-wanted em_alert_history records to reduce the size of the table:
    • Add table cleaner on em_alert_history by importing the sys_auto_flush record attached: sys_auto_flush_62c63f08930002000fb3b9ab357ffb4a.xml
      • Navigate to /sys_auto_flush_list.do, and right click on top of list and Import XML, then import sys_auto_flush_62c63f08930002000fb3b9ab357ffb4a.xml
  3. Trigger the Table Cleaner scheduled job so the scheduled cleanup of em_alert_history is triggered. See the set-up in Step 1
    • Navigate to /sys_trigger_list.do?sysparm_query=GOTOname%3DTable%20Cleaner, open the Table Cleaner scheduled job, and then right-click on the record and Execute Now
      • Alternatively, this job runs once an hour, may allow the scheduler to pick it up and run without intervention. If this is the case, move to the next step to confirm that the job finished running.
    • The clean up of em_alert_history table may take approx. 2 hours. Confirm that Table Cleaner job finished processing:
      • Navigate to /syslog_transaction_list.do?sysparm_query=sys_created_onONToday%40javascript%3Ags.daysAgoStart(0)%40javascript%3Ags.daysAgoEnd(0)%5Eurl%3DJOB%3A%20Table%20Cleaner and then verify that there is a recent record for JOB: Table Cleaner showing that the Table Cleaner job run above completed.
  4. Apply the attached Update Set sys_remote_update_set_f6d8cd66db5bb200b9d5b701ef9619e3.xml adding 6 new columns to em_alert_history. This takes time for the Update Set to be committed as the em_alert_history table is altered to add the new columns.  This runs in the background without impact on the instance functionality.
    • In testing, adding these columns to an em_alert_history table with 10M records took an hour.
  5. After new columns are added and update set completes, go forward with the updated to Istanbul or Jakarta.

 

Additional notes:

  • What data is in the em_alert_history table that is being cleaned? the data is this table stores historical information about alerts and is not used in the UI or impact calculation after 7 days.
  • What is being done in the future to prevent this issue? The table cleaner is being implemented via the workaround, but will also be implemented by a fix included in a future release.

Outside of the identified columns/tables above, there is no workaround for this issue as it is caused by the underlying operations that add a column. The process has been improved in Kingston, where this problem does not occur.

 


Related Problem: PRB1028261

Seen In

Istanbul Patch 3a

Fixed In

Kingston

Associated Community Threads

There is no data to report.

Article Information

Last Updated:2018-07-02 10:42:26
Published:2018-07-02