This article helps you work out why inputs returned from MID Servers are not being processed. Symptoms include:
- ECC Queue table [ecc_queue] inputs, which are the results returned by probes, remain in Ready state
- This in turn causes:
  - Discovery schedules get stuck and time out
  - Service Maps don't re-discover
  - Workflows get stuck at an Orchestration Activity
  - Events no longer come into Event Management
  - Import Sets never receive the data from the JDBC Data Source
  - MID Server related lists, such as the Threads list, are not updated with stats
  - and there will be others...
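As a rough way to confirm the first symptom, the sketch below counts inputs that are still in Ready state, grouped by topic. It is an illustration under assumptions, not an official script: the function name `countReadyInputsByTopic` and the 1000-row default cap are hypothetical, and it assumes the `queue` and `state` fields of ecc_queue hold the values `input` and `ready`. The GlideRecord class is passed in as an argument so that the function can also be exercised outside an instance with a stub.

```javascript
// Count ecc_queue inputs still in Ready state, grouped by topic.
// GlideRecordCtor: pass the platform's GlideRecord class when running on an instance.
// limit: hypothetical safety cap, because ecc_queue can be a very large table.
function countReadyInputsByTopic(GlideRecordCtor, limit) {
  var counts = {};
  var gr = new GlideRecordCtor('ecc_queue');
  gr.addQuery('queue', 'input');   // assumed stored value for inputs
  gr.addQuery('state', 'ready');   // assumed stored value for Ready
  gr.setLimit(limit || 1000);
  gr.query();
  while (gr.next()) {
    var topic = String(gr.getValue('topic'));
    counts[topic] = (counts[topic] || 0) + 1;
  }
  return counts;
}

// On an instance (Scripts - Background), one possible call would be:
// gs.print(JSON.stringify(countReadyInputsByTopic(GlideRecord, 1000)));
```

A topic with a large and growing count is a reasonable place to start when working through the causes below.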
Causes and Resolutions
The Sensor is inactive
For an input to change state from Ready to Processing to Processed, it needs a Sensor, and that is always implemented as an Insert Business Rule on the ecc_queue table. Most out-of-box features that use MID Servers will have a sensor, including:
| Feature | Sensor business rule |
| --- | --- |
| Discovery Probes | Discovery - Sensors |
| Orchestration Activities | Automation - Sensors |
| MID Server system | MID - Heartbeat, MID - Process XMLStats, etc. |
| Event Management | Event Management - Connector |
| Integrations | ECC Queue Retry Policy |
Check that the business rule is Active.
Check the Versions related list to see if the Business Rule has been customized, and revert to the out-of-box version.
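The Active check above could also be scripted. The following is a minimal sketch that lists inactive business rules defined on the ecc_queue table; the function name `findInactiveEccQueueRules` is hypothetical, and it assumes the Business Rule table [sys_script] stores the target table in the `collection` field. As before, the GlideRecord class is passed in so the function can be tried with a stub.

```javascript
// Return the names of inactive business rules defined on ecc_queue.
// GlideRecordCtor: pass the platform's GlideRecord class when running on an instance.
function findInactiveEccQueueRules(GlideRecordCtor) {
  var names = [];
  var gr = new GlideRecordCtor('sys_script');
  gr.addQuery('collection', 'ecc_queue'); // assumed field holding the rule's table
  gr.addQuery('active', false);
  gr.query();
  while (gr.next()) {
    names.push(String(gr.getValue('name')));
  }
  return names;
}

// On an instance (Scripts - Background), one possible call would be:
// gs.print(findInactiveEccQueueRules(GlideRecord).join('\n'));
```

An inactive rule in the output is not necessarily a fault, because some rules are deliberately disabled, so compare the names against the sensors listed in the table above.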
There is no Sensor
If there is no sensor the input will remain in Ready state. That may be fine if it is the kind of fire-and-forget outbound integration where nothing needs to record or process the result. It's still best practice to have a sensor, even if all it does is mark the input processed.
For inputs from the RESTMessageV2 or SOAPMessageV2 API, there is no out-of-box generic sensor. If no response is required and the request was sent with executeAsync(), then these too can be ignored, but if the response does need processing then integrations using those APIs should have a sensor.
Solution: Implement a Sensor, so that the MID Server can be used asynchronously without causing performance issues.
ServiceNow KB: Best practices for RESTMessageV2 and SOAPMessageV2 (KB0716391)
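To illustrate the simplest kind of sensor, one that only marks the input as processed, here is a minimal sketch. It is not the out-of-box implementation of any sensor: the helper name `markInputProcessed` is hypothetical, and it assumes the ecc_queue `state` field uses the stored values `ready` and `processed` and that the `processed` field holds the processing timestamp.

```javascript
// Mark a Ready ecc_queue input as Processed. Returns true if the record was updated.
// current: the ecc_queue record (the business rule's `current` object).
// processedAt: timestamp string to store in the assumed `processed` field.
function markInputProcessed(current, processedAt) {
  var isReadyInput = String(current.getValue('queue')) === 'input' &&
                     String(current.getValue('state')) === 'ready';
  if (!isReadyInput) {
    return false; // leave outputs and already-handled inputs untouched
  }
  current.setValue('state', 'processed');
  current.setValue('processed', processedAt);
  current.update(); // an async rule has to save its own changes
  return true;
}

// Inside an async Insert business rule on ecc_queue, filtered to your integration's
// topic, one possible call would be:
// markInputProcessed(current, new GlideDateTime().getValue());
```

A real sensor would normally read the payload and act on it before setting the state; this sketch only covers the fire-and-forget case described above.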
The Scheduler workers are backlogged...
MID Servers insert inputs into the ECC Queue via SOAP, using the instance's API_INT semaphores. So as not to block those semaphores, most sensor business rules run 'async', meaning they run in the Scheduler Worker threads. You can check for a backlog:
- Navigate to System Diagnostics.
- On the page, find System Overview.
- If the scheduler is backed up, Events Pending displays in red.
To see exactly what is queued, look at the ready and running jobs in the Scheduler table [sys_trigger]. The ecc_queue-related jobs have names similar to the sensor business rule. For example, the "Discovery - Sensors" business rule creates sys_trigger records named "ASYNC: Discovery - Sensors". To check that those are there:
- Navigate to System Scheduler -> Scheduled Jobs
- Add a filter for:
  - State is Ready, or State is Running
  - Next action is Today
  - Next action 'relative' on or before 1 minute ago
- You'll end up with a list URL something like this:
That will give the backlog of scheduler worker jobs for today. You could filter on 'Name Contains' if you know the sensor.
If your job is there, in Ready state, then it has not been run yet. Possible causes:
...because the instance is not keeping up with the workload
Scheduler Worker threads are a shared resource, used by all async business rules and scheduled jobs in the platform, and so something completely unrelated to the MID Server jobs could be delaying or blocking processing.
The list above gives you an idea of the names and ages of the jobs in the scheduler queue. If you filter it for only State = Running jobs, you can see what is currently running in the scheduler workers.
Solution: From the list you may be able to identify some long-running jobs, or a large number of similar quick-running jobs, that have in effect been blocking the queue.
Seeing what's currently running should let you pick out particular jobs and investigate whether they are functioning normally. For example, an unusually large batch update of incidents could have triggered a huge cascade of email notifications to CMDB owners. The cause could be anything.
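Spotting a large number of similar jobs is easier with a count per job name. The sketch below groups sys_trigger records by name and sorts the result with the most frequent name first. The function name `summarizeSchedulerBacklog` is hypothetical, and the stored State values are passed in as a parameter rather than hard-coded, because the values used in the usage comment (0 for Ready, 1 for Running) are an assumption to verify against the State choice list on your instance.

```javascript
// Count sys_trigger jobs per name for the given State values, most frequent first.
// GlideRecordCtor: pass the platform's GlideRecord class when running on an instance.
// stateValues: array of stored State values to include.
function summarizeSchedulerBacklog(GlideRecordCtor, stateValues) {
  var counts = {};
  var gr = new GlideRecordCtor('sys_trigger');
  gr.addQuery('state', 'IN', stateValues.join(','));
  gr.query();
  while (gr.next()) {
    var name = String(gr.getValue('name'));
    counts[name] = (counts[name] || 0) + 1;
  }
  var summary = Object.keys(counts).map(function (name) {
    return { name: name, count: counts[name] };
  });
  summary.sort(function (a, b) { return b.count - a.count; });
  return summary;
}

// On an instance (Scripts - Background), assuming 0 = Ready and 1 = Running:
// gs.print(JSON.stringify(summarizeSchedulerBacklog(GlideRecord, [0, 1])));
```

A name that dominates the summary, whether or not it is related to the ECC Queue, is a candidate for the job that is holding up the sensors.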
If the backlog gets particularly bad, consider opening a support incident to get help identifying its cause.
But, if nothing is currently running, it may be...
...because the instance is Paused
You can check this by running a background script as an admin user:
- Navigate to System Definition -> Scripts - Background
- In the '' box, paste this script
- Click Run Script
If that returns "true", then someone or something has paused the workers, meaning they won't process jobs.
Solution: Check that the instance is not currently in the middle of an upgrade, and wait for it to finish if it is: Navigate to System Diagnostics -> Upgrade Monitor
If the instance is not in the middle of an upgrade, then you probably need to open a support incident, as something has gone wrong.
The Topic is LDAPConnectionTesterProbe, and it took more than 55 seconds
By default, the 'Test Connection' feature of LDAP Servers that use MID Servers will only wait 55 seconds for the result. The test runs periodically, and also when you click the link on the LDAP Server form.
If the Input for the probe takes longer than that to return to the instance, then that input will remain in Ready state, because the code that was waiting to process it already gave up waiting.
See KB0743756/PRB1331240 LDAP "Test Connection" and "Browse" features can timeout, and LDAP Monitor may show Connection Status as Not Connected, due to running at Standard(2) MID Server priority - Did not get a response from the MID server after waiting for 55 seconds
The Sensor Crashed
In some cases a sensor may run, but crash before it has been able to update the state to Processing or Processed, and you are also very unlikely to see the Error String field populated with the reason. Checking the system log or the app node localhost logs is often the only way to see what happened.