Threads running in a MID Server can get stuck and then block other MID Server threads, such as those running Probes or Patterns. To aid debugging, you can schedule a regular thread dump (jstack) of the MID Server application, which shows which threads are blocked or waiting, and on what.
- Open the list of all MID Servers (ecc_agent table): in the navigator, go to MID Server -> Servers.
- Filter the list down to only the MID Server(s) you want thread dumps for, e.g. Name contains 'Disco'.
- Add an extra condition for Status is Up, so that new thread dump jobs are only sent to MID Servers that are running. This avoids a backlog of jobs building up for MID Servers that are down.
- Right-click the blue filter line and select 'Copy query'. This gives you the encoded query string for the list filter, which the script will use, e.g. "nameLIKEdisco^status=Up"
- Open a new Scheduled Script record (sysauto_script): System Definition -> Scheduled Jobs, click New, click Automatically run a script of your choosing.
- Fill in Name: (custom) MID Servers Thread Dump
- Fill in the schedule fields. e.g. Run: Periodically, Minutes: 10. Pick a time when no jobs are likely to be running in your MID Servers.
- Paste the following script into the Run this script field:
// Scheduled script to regularly take a thread dump of the selected MID Servers (KB0725067)
var midGr = new GlideRecord('ecc_agent');
midGr.addEncodedQuery('<query string goes here>'); // You may want additional conditions to limit which MID Servers are involved.
midGr.query(); // Run the query; without this no records are loaded.
while (midGr.next()) { // Loop over every MID Server that matches the filter.
    var agent_name = midGr.name.replace(/'/g, "\\'"); // Escape single quotes in the MID Server name.
    var midmanage = new MIDServerManage();
    midmanage.threaddump(agent_name); // Writes the thread dump ecc_queue output record for this MID Server.
    gs.info('(custom) Running MID Server Thread Dump for MID Server: ' + agent_name);
}
- Paste the query string you copied earlier into the addEncodedQuery call in the script, replacing the <query string goes here> placeholder.
- Submit. The script will now run at the next scheduled time.
- To test this script, or run it on demand, use the Execute Now button.
- You can then Grab Logs for a MID Server to see the thread dumps in the wrapper.log file.
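The two string manipulations the steps above rely on can be tried outside the instance. The following is a minimal sketch in plain JavaScript (it runs in Node.js and uses no ServiceNow APIs). It assumes the common case where an encoded query is a set of field/operator/value terms joined with ^ (AND). The helper names buildEncodedQuery and escapeSingleQuotes are hypothetical and exist only for this illustration.

```javascript
// Plain JavaScript sketch; runs outside the instance (for example in Node.js).
// buildEncodedQuery and escapeSingleQuotes are hypothetical helpers, not ServiceNow APIs.

// Join condition terms with '^', which an encoded query treats as AND.
function buildEncodedQuery(conditions) {
  return conditions
    .map(function (c) { return c.field + c.operator + c.value; })
    .join('^');
}

// The same replacement the scheduled script applies to the MID Server name:
// every single quote is prefixed with a backslash.
function escapeSingleQuotes(name) {
  return name.replace(/'/g, "\\'");
}

var query = buildEncodedQuery([
  { field: 'name', operator: 'LIKE', value: 'disco' },
  { field: 'status', operator: '=', value: 'Up' }
]);
console.log(query); // nameLIKEdisco^status=Up
console.log(escapeSingleQuotes("Bob's MID")); // Bob\'s MID
```

In practice, copying the query from the list filter remains the safer route, since the list builds the string for you. The sketch is only meant to make the copied string easier to read and adjust.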
Note: This will spam the MID Servers' wrapper logs, which may have consequences for disk space, so be sure to deactivate this job once you have finished debugging.
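To get a rough feel for how much log output the job generates, the arithmetic can be sketched as follows. This is plain JavaScript (runnable in Node.js, no ServiceNow APIs). The function name estimateDumpVolume is hypothetical, and the dump size of 200 KB is purely an assumed figure: measure a real thread dump in wrapper.log before relying on the result. Any log rotation configured on the MID Server may also cap the actual disk usage.

```javascript
// Plain JavaScript sketch; estimateDumpVolume is a hypothetical helper.
// dumpSizeKb is an ASSUMED value: check a real thread dump in wrapper.log for your environment.
function estimateDumpVolume(intervalMinutes, midServerCount, dumpSizeKb) {
  // Number of times the scheduled job fires per day for each MID Server.
  var dumpsPerDayPerMid = Math.floor((24 * 60) / intervalMinutes);
  return {
    dumpsPerDayPerMid: dumpsPerDayPerMid,
    totalDumpsPerDay: dumpsPerDayPerMid * midServerCount,
    approxMbPerDayPerMid: (dumpsPerDayPerMid * dumpSizeKb) / 1024
  };
}

// Example: the 10-minute interval from the schedule step, 3 matching MID Servers,
// and an assumed 200 KB per thread dump.
var estimate = estimateDumpVolume(10, 3, 200);
console.log(estimate.dumpsPerDayPerMid); // 144
console.log(estimate.totalDumpsPerDay); // 432
console.log(estimate.approxMbPerDayPerMid); // 28.125
```

Even with a modest assumed dump size, a short interval adds up quickly, which is why the job should only stay active for the duration of the investigation.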