New clone engine changes

 

Overview


A new cloning engine has been released to customers that clones from the latest backup instead of live data. It is referred to as the "backup-based clone engine," while the legacy approach is referred to as the "in-app clone engine."

Note the following:

  • Clones use the last backup taken from the source instance. Live data is no longer cloned unless the backup-based clone engine fails and falls back to the legacy in-app clone engine, which clones from the Standby database only.
  • Rollbacks of backup-based clones cannot be performed from the customer's in-app clone history; they can only be performed on the datacenter clone record by ServiceNow staff, within 24 hours of clone completion.
  • Cross DC clones work on the backup-based clone engine, but fail if a fallback to the in-app clone engine occurs.
  • The staging process works like an unzip operation run against the last backup taken.
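
As a rough illustration of the 24-hour rollback window mentioned above, the following Python sketch computes the latest time at which a rollback could still be requested. It assumes the window is measured from the clone completion timestamp and that the boundary is inclusive; neither detail is confirmed by this article, and the function names are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: the window starts at the clone completion timestamp.
ROLLBACK_WINDOW = timedelta(hours=24)


def rollback_deadline(clone_completed_at: datetime) -> datetime:
    """Return the last moment a rollback could be requested (hypothetical helper)."""
    return clone_completed_at + ROLLBACK_WINDOW


def rollback_still_possible(clone_completed_at: datetime, now: datetime) -> bool:
    """True while 'now' is inside the assumed, inclusive 24-hour window."""
    return now <= rollback_deadline(clone_completed_at)


# Illustrative completion time (GMT); any completion timestamp works the same way.
completed = datetime(2017, 11, 20, 23, 31, 28)
print(rollback_deadline(completed))
print(rollback_still_possible(completed, datetime(2017, 11, 21, 12, 0, 0)))
```

In practice, contact Customer Support well before the window closes, since the rollback itself is performed by ServiceNow staff.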

How does the new engine work?


The new backup-based clone engine supports Gen 2 and Gen 3.

The goal of the backup-based clone engine is to process clones faster, but the new functionality can raise some questions for end users. The new process can be seen in the CHG record.

The following is a step-by-step description of the process using an example change, with a planned start date of 2017-11-20 23:00:00 (GMT):  

  • Upon initial creation of the CHG for the clone request, we see this message:
2017-11-20 09:47:52 Datacenter Automation (NOW) 

Original Request: System Clone <source.instance> over <target.instance> at 2017-11-20 23:00:00 (GMT). Please Note: This is an automated process.

Progress and automated milestones updates can be found within the clone history record on the source instance. Please contact Customer Support with questions.

  • Clone preparation starts 4 hours ahead of the planned start time:

2017-11-20 19:25:02 Datacenter Automation (NOW) 

Clone Progress : 1% 

Starting Clone Preparation

 

  • Preflight checks are run to ensure the clone can be executed. When they pass, we see this message:
2017-11-20 19:33:25  Datacenter Automation (NOW) 

Clone Progress : 2.3% 

Clone has passed Pre flight checks


  • Backup retrieval is complete. The target instance is still not impacted. Note that backup retrieval times vary with the size of the backup. The message also gives the creation time of the backup being used:
2017-11-20 19:38:20  Datacenter Automation (NOW)

This clone will use data from a backup created at : 2017-11-20 02:15:34 GMT


  • A new database is provisioned for the target instance. Upon successful completion, we see this message:

2017-11-20 19:52:25 Datacenter Automation (NOW)

Provisioning a new database for target instance successful.

 

  • The backup is now extracted to the target database. The target instance is not impacted.
2017-11-20 19:57:58  Datacenter Automation (NOW) 

Clone Progress : 12.9% 

Started copying the data onto target database.


  • An estimate is provided based on previous restores. We see this message:
    NOTE: Treat this only as an approximation, since the size of the instance may have changed since the previous restore.

2017-11-20 19:58:12  Datacenter Automation (NOW) 

Based on previous restores run time, restore for primary mysql instance <MYSQL Instance details of customer> will take roughly 1 hours, 15 minutes

 

  • The backup has been extracted to the target database.
2017-11-20 21:10:50  Datacenter Automation (NOW) 

Clone Progress : 33.9% 

Finished copying data to the target database. Applying data preservers on the target database.


  • The process switches over to the newly cloned target instance. From this point onwards the target instance is impacted. We see this message:
2017-11-20 23:07:51 Datacenter Automation (NOW)

Switching to the new cloned target instance. The target instance will be offline for a few minutes during the switchover

  • The clone time arrives, but there is no further notification until the clone has completed:
2017-11-20 23:31:28  Datacenter Automation (NOW)

Clone Progress : 100% 

Clone completed successfully.
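
To follow a clone from the CHG record, it can help to pull the progress percentages out of the automation entries. The following Python sketch parses text in the shape of the example messages above. It assumes the entries have been copied out as plain text; the function name and the regular expressions are illustrative and not part of any ServiceNow API.

```python
import re
from datetime import datetime

# Entry header, e.g. "2017-11-20 19:25:02 Datacenter Automation (NOW)".
HEADER = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+Datacenter Automation")
# Progress line, e.g. "Clone Progress : 12.9%".
PROGRESS = re.compile(r"Clone Progress\s*:\s*([\d.]+)%")


def extract_progress(notes: str):
    """Return (timestamp, percent) pairs in the order they appear."""
    results = []
    current = None  # timestamp of the most recent entry header
    for raw in notes.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            current = datetime.strptime(header.group(1), "%Y-%m-%d %H:%M:%S")
            continue
        progress = PROGRESS.search(line)
        if progress and current is not None:
            results.append((current, float(progress.group(1))))
    return results


SAMPLE = """\
2017-11-20 19:25:02 Datacenter Automation (NOW)

Clone Progress : 1%

Starting Clone Preparation

2017-11-20 19:57:58  Datacenter Automation (NOW)

Clone Progress : 12.9%

Started copying the data onto target database.
"""

for when, percent in extract_progress(SAMPLE):
    print(when, percent)
```

Entries without a progress line, such as the backup date or the switchover notice, are simply skipped by this sketch.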


  

 

Changes to information reporting


How information is reported in the clone record has changed. The tabs for Database Table Clones and Preserved Data were removed. For now, the Clone Log tab remains.

 

Falling back to in-app cloning


There are certain situations where the backup-based clone engine may not be used and the process reverts to in-app cloning. For example:

  • The restore failed
  • The backup was not usable
  • No capacity was found

If capacity issues are encountered during the preflight check (for example, the clone is from an xxlarge, xlarge, or large database type to a pico type), the clone is not allowed to fall back to the in-app clone engine when it is a cross DC clone. Note that this also applies within the same datacenter pair. The following errors are logged on the instance:

"We don't support this clone capacity mapping: [source]xxlarge;[target]pico" 
"Clone was not in ready state, exiting without running clone (state: Error)" 
"Clone Failed."

 

Upgrading a source instance and cloning on the same day


If a source instance is upgraded and a clone request takes place on the same day, the backup-based clone engine must consider falling back to the in-app clone engine. Because the backup-based clone engine clones from the most recent backup, it detects that glide.war was updated on the source instance after the backup was taken, so the glide.war in the backup differs from the glide.war on the live instance. When the two differ, the clone engine runs the 30-minute workflow.

The 30-minute workflow performs several checks to decide whether to use the in-app clone engine or the backup-based clone engine. Once the checks are complete, a record is generated in the datacenter updates.

 

Clone can fail if it has complex data preserver settings


If there are any complex data preserver settings, such as dot walking or preserving hierarchical tables, the clone automatically fails and cancels before it starts. This is an important point to understand, as it is not uncommon to have such settings. Clones fail if data preservers are configured for tables that extend task or cmdb_ci.
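
Before requesting a clone, it can be useful to review the data preservers for the settings described above. The following Python sketch shows one way such a review could look. Everything in it is an assumption for illustration: the parent map, the preserver records and their "fields" key are hypothetical stand-ins for data exported from the instance, and the sketch also flags task and cmdb_ci themselves, which this article does not explicitly state.

```python
# Hypothetical parent map: table name -> table it extends (None for a base table).
PARENT = {
    "task": None,
    "incident": "task",
    "cmdb_ci": None,
    "cmdb_ci_server": "cmdb_ci",
    "sys_properties": None,
}

# Hierarchy roots named in the article.
HIERARCHY_ROOTS = {"task", "cmdb_ci"}


def in_flagged_hierarchy(table, parent=PARENT):
    """Walk up the parent chain and report whether it reaches task or cmdb_ci."""
    seen = set()
    while table is not None and table not in seen:
        if table in HIERARCHY_ROOTS:
            return True
        seen.add(table)
        table = parent.get(table)
    return False


def risky_preservers(preservers):
    """Return (name, reason) for each preserver this sketch considers complex."""
    findings = []
    for p in preservers:
        if in_flagged_hierarchy(p["table"]):
            findings.append((p["name"], "hierarchical table"))
        elif any("." in field for field in p.get("fields", [])):
            # A dotted field name is treated here as a sign of dot-walking.
            findings.append((p["name"], "dot-walked field"))
    return findings


preservers = [
    {"name": "Core Instance Properties", "table": "sys_properties", "fields": ["name"]},
    {"name": "Keep incidents", "table": "incident", "fields": []},
    {"name": "Keep by department", "table": "sys_user", "fields": ["department.name"]},
]
for name, reason in risky_preservers(preservers):
    print(name, "->", reason)
```

The actual validation is performed by the clone engine; this sketch only helps spot likely candidates in advance.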

 

Clone fails if core instance properties are not preserved


During the clone preflight check, the clone_data_preserver table is queried for the record with name=Core Instance Properties and table=sys_properties. If the record is not found, the clone fails with an error message stating that it is not able to find Core Instance Properties in the clone_data_preserver table. Core Instance Properties contains instance details such as the instance name and instance ID, which are specific to that instance and need to be preserved on the target instance.

Action to be taken:

Navigate to the clone_data_preserver table list and search for the record named Core Instance Properties.

  • If there is a record with this name, check the table name. It should be sys_properties; if it is not, correct it.
  • If there is no record with that name, search for records with Table set to sys_properties. There can be more than one, so look for a record whose name starts with Core Instance. If such a record is found, change its name to Core Instance Properties. If no record starts with Core Instance, the record has probably been deleted. In that case, refer to the snapshot below and create a new entry in the clone_data_preserver table.

Reference Core Instance Properties entry in the clone_data_preserver table.
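
The troubleshooting steps above amount to a small decision procedure, sketched below in Python. The sketch assumes the clone_data_preserver records are available as a list of dictionaries with "name" and "table" keys; that representation, the function name and the returned labels are all hypothetical.

```python
EXPECTED_NAME = "Core Instance Properties"
EXPECTED_TABLE = "sys_properties"


def remediation(records):
    """Suggest the next action for a list of {'name': ..., 'table': ...} records."""
    named = [r for r in records if r["name"] == EXPECTED_NAME]
    if named:
        # A record with the right name exists; verify its table.
        if any(r["table"] == EXPECTED_TABLE for r in named):
            return "ok"
        return "correct the table to sys_properties"

    # No record with the exact name: look for a near match on sys_properties.
    candidates = [
        r for r in records
        if r["table"] == EXPECTED_TABLE and r["name"].startswith("Core Instance")
    ]
    if candidates:
        return "rename the record to Core Instance Properties"
    return "create a new Core Instance Properties record"


print(remediation([{"name": "Core Instance Props", "table": "sys_properties"}]))
```

Each returned label corresponds to one branch of the steps above, so the sketch can double as a checklist.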

 

Clone pre-flight check failures


Validations are performed when a clone request is submitted. The request is rejected if any of the checks below fails.

1. Developer Instance check failure:

    The clone request is rejected if the source or target instance is a developer instance. Such a selection is not currently supported.

2. Capacity mapping check failure:

 

The clone request is rejected if the mapping between the source instance's capacity size and the target instance's capacity size is one of the mappings shown below, because a clone with such a mapping is very likely to leave the target instance unusable. If the target capacity is smaller than small, any clone will fail.

  Source Capacity Size                                                      Target Capacity Size
  small, medium, large, xlarge, xxlarge, mega, ultra, giga, tera, peta      pico, nano, micro
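
The capacity mapping check can be expressed as a simple lookup, as in the Python sketch below. The size names come from the table above and the message format mirrors the error quoted earlier in this article; the function names are hypothetical and the real check runs inside the clone engine.

```python
# Capacity sizes taken from the mapping table in this article.
REJECTED_SOURCE_SIZES = {
    "small", "medium", "large", "xlarge", "xxlarge",
    "mega", "ultra", "giga", "tera", "peta",
}
REJECTED_TARGET_SIZES = {"pico", "nano", "micro"}


def is_rejected(source: str, target: str) -> bool:
    """True when the source/target pair matches a rejected mapping."""
    return (
        source.lower() in REJECTED_SOURCE_SIZES
        and target.lower() in REJECTED_TARGET_SIZES
    )


def rejection_message(source: str, target: str) -> str:
    """Format a message in the style of the error logged on the instance."""
    return (
        "We don't support this clone capacity mapping: "
        f"[source]{source};[target]{target}"
    )


if is_rejected("xxlarge", "pico"):
    print(rejection_message("xxlarge", "pico"))
```

A pair such as small to small is not in the table, so this sketch lets it through.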

Article Information

Last Updated:2018-03-20 09:03:04
Published:2018-03-20