The objective of this KB is to answer a few frequent questions that Tech Support receives about upgrades. Please add any generic questions in the comments so that we can keep this KB updated.
This information is generic; if you have follow-up questions, please raise a ticket with us.
Customisations on your instance can affect the behaviour described below, which is stated from an out-of-box (OOB) perspective.
1.How to Schedule/Modify/Cancel the upgrade Change in HI?
2.Patching program and End of Life upgrade program KBs
End of Life aka Unsupported Release Family Upgrades FAQ ==> https://hi.service-now.com/kb_view.do?sysparm_article=KB0610454
3.What is the best practice to be followed for upgrades?
This is a generic YouTube video explaining the process of planning and testing upgrades.
4.Why didn’t my upgrade start even though the planned start time on the HI Change has passed?
The upgrade is triggered by the "Upgrade" job on the instance, which runs at an hourly interval (OOB). Every hour, this job checks the instance record in HI for the assigned WAR version.
When it finds a new assigned WAR version, it downloads that version.
***An instance upgrade can take up to 1 hour to trigger on the instance from the planned start time of the HI Change.
The actual work start time is updated on the HI Change when the upgrade is actually triggered on the instance.
You can check the *upgrade jobs by going to the below URL.
Please make sure the System ID value on the "Upgrade" job is blank. If there is a value, set it to --None--.
Hardcoding the System ID is not a recommended approach: the selected node may be a far node (in the secondary DC, in the case of AHA), and the upgrade can then run for a long time.
The "Upgrade" job downloads the WAR file and upgrades the node on which the job is triggered. This node is then restarted, and the second job, "Check Upgrade Script", is triggered on system start-up and continues the upgrade. Both of these jobs run on worker threads.
Modifying the Next action time of the "Upgrade" job after it has gone past the "Planned start time" of the HI Change is not recommended, as this can create issues.
Please set the "Planned start time" of your change about 10-15 minutes before the run time of this job. Alternatively, you can adjust the minutes of the job's Next action time, perhaps 2-3 hours before the "Planned start time" of the change, so that it is in line with the start of the change and the job can run a few cycles successfully beforehand.
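The timing described above can be sketched in a few lines of Python. This is only an illustration, assuming the job repeats at a fixed hourly interval counted from its Next action time; the function name and the sample dates are hypothetical and are not part of the instance or of HI.

```python
from datetime import datetime, timedelta


def first_run_at_or_after(next_action, planned_start, interval=timedelta(hours=1)):
    """Return the first scheduled run of a recurring job at or after planned_start.

    next_action is the job's upcoming run time; later runs are assumed to
    follow at a fixed interval (hourly by default, as described above).
    """
    if next_action >= planned_start:
        return next_action
    gap = planned_start - next_action
    # Ceiling division: whole intervals needed to reach or pass planned_start.
    cycles = -(-gap // interval)
    return next_action + cycles * interval


planned = datetime(2024, 1, 6, 10, 0)   # hypothetical "Planned start time"
job_run = datetime(2024, 1, 6, 9, 50)   # hypothetical Next action of the "Upgrade" job
trigger = first_run_at_or_after(job_run, planned)
print("Expected trigger:", trigger, "| delay after planned start:", trigger - planned)
```

In this example the job runs at 09:50, just before the planned start of 10:00, so the upgrade would not trigger until 10:50. Moving the planned start to 10-15 minutes before a job run keeps that delay short.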
5.What is the best process to DRY RUN my upgrade?
To record the correct timelines/behaviour, there are 2 options:
• Option 1: Initiate a clone from PROD to SUB PROD that includes attachments, audit and log data (the default is to EXCLUDE them), and then upgrade the SUB PROD instance. Including these tables is very important because they are the largest tables on an instance, and index creation/schema changes on them can take a major chunk of the total upgrade time. If they are excluded, that time cannot be factored into the planned upgrade activity. From London onwards, a clone lets you select the amount of data copied for task tables (the default is 90 days). Please make sure you select "Full", as the full task table is needed to analyse the correct timelines.
• Option 2: Request Customer Support to restore your PROD instance to the SUB PROD instance so that SUB PROD is an exact replica. SUB PROD can then be upgraded, and this should give you a definite ETA for the upgrade run time.
Upgrades should be DRY RUN multiple times. If the upgrade is properly tested on SUB PROD instances, the upgrade on PROD should go smoothly, and support is available 24x7 to look into any unforeseen issues.
6.How long will my upgrade run?
7.What happens on my instance during an upgrade?
Record creation and existing records are not impacted by the upgrade. Only the scripts and the schema are modified by the upgrade.
One node runs the whole process: it upgrades itself and then triggers the upgrade of the schema/DB. In parallel, it runs the upgrade on the other nodes; after the upgrade, the nodes are restarted, which resets the connections.
So once the nodes are restarted, they will be on the newer version, but the DB will not be on the newer version until the database upgrade is completed. Even though record creation is not interrupted, my personal recommendation is to restrict users during the upgrade so that the system can use all available resources for the upgrade activity.
8.Can we take an ad-hoc backup prior to the upgrade?
We take separate backups for Primary and Secondary DB (where available).
The backup cycle consists of four weekly full backups and the past 6 days of daily differential backups, which together provide 28 days of backups. All backups are written to disk; no tapes are used and no backups are sent off site. All the controls that apply to live customer data also apply to backups. If data is encrypted in the live database, it will also be encrypted in the backups.
More information on the Backups and Retention is covered in the below white-paper.
9.What are my options to recover my data in case of Data corruption post upgrade?
A point-in-time (PIT) restore is done by restoring the instance to the last available backup and then recreating the remaining data from the BIN logs/transaction logs. This process can take longer depending on the number of INSERT/DELETE/UPDATE operations run on the instance since the last backup.
However, we can bring the instance to any point in time within the previous 2-3 days (BINLOG retention is 3 days).
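As a rough illustration of the retention window, the check below tests whether a requested restore point falls inside the last 3 days. It is a sketch only: the function name and sample timestamps are hypothetical, and it does not reflect how Customer Support actually validates a PIT request.

```python
from datetime import datetime, timedelta

BINLOG_RETENTION = timedelta(days=3)  # retention figure quoted above


def pit_target_in_window(target, now, retention=BINLOG_RETENTION):
    """Return True if target lies within the retention window ending at now."""
    return now - retention <= target <= now


now = datetime(2024, 1, 10, 12, 0)             # hypothetical current time
two_days_ago = datetime(2024, 1, 8, 12, 0)     # inside the window
five_days_ago = datetime(2024, 1, 5, 12, 0)    # outside the window
print(pit_target_in_window(two_days_ago, now))
print(pit_target_in_window(five_days_ago, now))
```

A target older than the retention period can only be approximated by the nearest available backup, since the logs needed to replay the remaining changes are no longer kept.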
A PIT restore is a manual process involving multiple teams. It is initiated only in critical conditions and cannot be treated as a failover/recovery/back-out/rollback method.
******We do not perform a PIT restore directly on PROD; we only fix forward. A PIT restore is used only for data comparison and data recovery on TEMP/SUB PROD instances.
Once we have the data on the TEMP instance, it can be used to compare the data between the pre-upgrade and post-upgrade versions, and we have standard operating procedures to transfer the data based on the individual case.
10.What are the features impacted on the instance/application during an upgrade?
If any of your integrations is not working during the upgrade, there is a high probability that the job is NOT listed as “Upgrade Safe”.
We recommend NOT modifying this setting during the course of the upgrade, as we cannot comment on the after-effects.
Custom scheduled jobs should NOT be configured as “Upgrade Safe”; this field exists so that jobs do not interfere with the upgrade.
11.How to restrict users during an upgrade?
There are a few workarounds and customisations to achieve this. They should be implemented with your developers/partners and are outside the scope of Customer Support.
A few of the available options are as follows:
• If your instance has Single Sign On implemented, you can restrict access via the portal/identity provider for SSO/SAML.
• You can also run a custom script to log out all logged-in users prior to the change/upgrade window.
• You can also create a local admin account so that only admin users can access the instance, using the local account to bypass the SAML/SSO login.
12.Can ServiceNow Tech Support monitor my upgrade till completion?
Any functional/operational issues encountered during an upgrade will have to wait until the upgrade is completed. We have seen that most issues encountered during an upgrade rectify themselves once the upgrade completes, as there are a lot of moving parts during an upgrade, with new fields/functionality being added and removed along the way.
Monitoring the upgrade from the back-end localhost node log is not a realistic approach, as up to 150 lines can be written to that log per second.
The best way to monitor the upgrade is via the Upgrade Monitor window. Customer Support also uses this window to monitor upgrades, for the reason stated above.
The Upgrade Monitor shows the number of plugins remaining, an estimate of what is currently being upgraded, and an estimate of the remaining records per plugin as it runs.
"sys_update_state" is a very important table in which the duration of each activity is stored during the upgrade. So if you think your upgrade is stuck, you can look at this table on your SUB PROD DRY RUN (only if it was a full clone of PROD) to get a ballpark estimate to compare with the PROD upgrade.
In specific cases, Tech Support monitors the upgrade for active issues that were raised, tracked and rectified during the SUB PROD dry run; this is communicated to the customer via an active case.
We do not recommend creating placeholder cases for Tech Support to monitor upgrades, for the reasons stated above. We have different modules and SME teams for each functionality, so a placeholder case will not help; cases should be addressed to the individual teams for the respective issues.
In case of any issues observed during an upgrade, please call our 24x7 support number and we will be able to assist you with that specific issue.
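The dry-run comparison mentioned earlier in this section (using the activity durations recorded on SUB PROD as a ballpark for PROD) could be sketched as follows. Everything here is an assumption for illustration: the activity names, the durations, and the 1.5x tolerance are hypothetical, and the data is assumed to have been exported by hand rather than read from any real API.

```python
def activities_running_long(dry_run_seconds, prod_elapsed_seconds, tolerance=1.5):
    """List activities whose PROD elapsed time exceeds the dry-run baseline.

    Both arguments map an activity name to a duration in seconds. An activity
    is flagged when its PROD time is more than `tolerance` times the dry-run
    time. Activities with no dry-run baseline are skipped.
    """
    flagged = []
    for activity, elapsed in prod_elapsed_seconds.items():
        baseline = dry_run_seconds.get(activity)
        if baseline is None:
            continue  # nothing to compare against
        if elapsed > baseline * tolerance:
            flagged.append(activity)
    return flagged


# Hypothetical durations noted from the SUB PROD dry run and the PROD upgrade.
dry_run = {"schema_changes": 1800, "plugin_load": 2400, "index_creation": 3600}
prod = {"schema_changes": 1900, "plugin_load": 2500, "index_creation": 7200}
print(activities_running_long(dry_run, prod))
```

An activity appearing in the result is only a hint that PROD is slower than the dry run; it does not by itself mean the upgrade is stuck.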
13.What to do when a node upgrade has failed/a node is RED in the Upgrade Summary?
All user traffic will be diverted to the available nodes until this is sorted.
If you notice that any nodes are having issues, are down, or have failed during the course of the upgrade, please wait until the Upgrade change is auto-closed. This can take 20-30 minutes from the time the Upgrade Summary report is generated.
If the issue is not resolved, please call Customer Support and we will manually upgrade the node or spin up a new node on the new version and disable the stale one.
14.Can I rollback the upgrade?
Only Patch upgrades can be rolled back.
If Customer Support is initiating the rollback on PROD via Change Management, we will have to test it on a SUB PROD first to make sure we get the intended result.
For this, we will restore PROD to a SUB PROD, try the rollback there to check that it produces the intended result, get customer confirmation, and then perform it on PROD.
*****The rollback does not record schema drops (of tables or columns; index drops are OK), re-parenting/column promotion, table truncates, table/column renames, column type changes, or column width decreases. These are blacklisted from the rollback and are not configurable.
ServiceNow recommends FIX FORWARD in case of any unforeseen catastrophic issues encountered post upgrade.
For more information on data recovery options, you might also want to look at the section **What are my options to recover my data in case of Data corruption post upgrade?
More information on the above is in the documentation below:
From London onwards, an instance admin can trigger this:
15.How to use an Update Set to capture reverted customisations made after an upgrade?
An update set will NOT capture the Disposition, Resolution, Comment, etc. from the Instance’s upgrade log.
Upon committing the update set on the new instance, these log entries (required columns) should be transferred using Import Sets for housekeeping purposes.
16.Why is the skipped record count different from my SUB PROD instance?
We have seen cases where the count differs between PROD and SUB PROD. This happens if the SUB PROD was not a FULL clone of PROD (for example, the logs/audit were excluded). Please check the section **What is the best process to DRY RUN my upgrade? for more information.
17.Why is there a mismatch in the skipped updates count between the “Upgrade Summary” and the “Upgrade History” record?
The best place to get a consolidated list of what happened to the records during the upgrade is the table sys_upgrade_history_log, grouped by “Disposition”.
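If the log rows are exported for offline review, the same grouping can be reproduced with a short script. This is a minimal sketch: the row layout, the "disposition" key, the file names, and the sample values are assumptions for illustration, not the actual export format of sys_upgrade_history_log.

```python
from collections import Counter


def count_by_disposition(rows):
    """Count exported upgrade-log rows per disposition value."""
    return Counter(row["disposition"] for row in rows)


# Hypothetical exported rows; real exports will have more columns and values.
sample_rows = [
    {"file_name": "sys_script_include_a", "disposition": "Skipped"},
    {"file_name": "sys_ui_policy_b", "disposition": "Updated"},
    {"file_name": "sys_script_c", "disposition": "Skipped"},
    {"file_name": "sys_dictionary_d", "disposition": "Inserted"},
]

for disposition, total in count_by_disposition(sample_rows).most_common():
    print(disposition, total)
```

Comparing these per-disposition totals with the figures shown on the summary and history records helps to pinpoint which category accounts for a mismatch.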
18.Can we cancel the upgrade while it is in progress?