- This article explains what an alert is, how alerts are created and handled by ServiceNow Event Management (EM), and the alert processing life cycle.
2. What is an Alert?
- An alert is a notification that draws attention to one or more events. Events trigger alerts so that responsible parties can act before things go wrong. The flow is as follows:
Data collection > Events > Alerts
Lots of things > Some things > Few things
- Any event that meets or exceeds a defined condition or threshold and requires immediate attention or action by service providers (sysadmins, DBAs, network engineers, product managers, service managers, service desk) is converted to an alert.
- Please refer to Event Processing for more information on how the events are handled and processed.
3. Alert Processing Flow
- The below diagram explains the alert processing flow based on different stages.
- The event rule mechanism categorizes and processes events based on defined criteria. Each rule has conditions, such as the source of the event or the maintenance state.
- If an event meets the conditions of a rule, the rule either continues processing the event or ignores it.
- The outcome of processing is an Alert.
- Refer to Event Rules for more information on event rules.
- After an event is processed successfully, an alert record is generated.
- An alert can be in one of four states: Open, Closed, Reopen, and Flapping.
- Each state has its own execution flow, which is explained in detail in the next section.
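The rule evaluation described above can be sketched as follows. This is a minimal illustration, not ServiceNow code: the `Event` and `EventRule` classes, their fields, and the `process_event` function are hypothetical names, and the sketch assumes that the first matching rule decides the outcome and that an event matching no rule continues to alert creation.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Event:
    source: str
    node: str
    message_key: str


@dataclass
class EventRule:
    # Hypothetical rule model: a condition plus an "ignore" flag.
    name: str
    condition: Callable[[Event], bool]
    ignore: bool = False


def process_event(event: Event, rules: List[EventRule]) -> Optional[dict]:
    """Return an alert record, or None if a matching rule ignores the event."""
    for rule in rules:
        if rule.condition(event):
            if rule.ignore:
                return None  # "Ignore the event"
            break  # assumption: the first matching rule decides
    # Outcome of processing: an alert in the Open state.
    return {"message_key": event.message_key, "source": event.source, "state": "Open"}


rules = [EventRule("ignore maintenance", lambda e: e.source == "maintenance", ignore=True)]
ignored = process_event(Event("maintenance", "host1", "k1"), rules)
alert = process_event(Event("monitoring", "host1", "k2"), rules)
```

Here `ignored` is `None`, while `alert` is a record in the Open state.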
4. The Alert States and Processing
- There are four alert states: Open, Closed, Reopen, and Flapping. Each state has its own execution flow and associated code. The details of each state are given below.
- Open is the first state in alert processing. When an event is processed successfully, it creates an alert.
- An alert is opened whenever an event is not ignored or its threshold is exceeded by an event rule, and de-duplication does not identify the event as belonging to an existing alert.
- Open alerts are picked up by the scheduled job "Event Management - Evaluate Alert Management Rules", which executes every 11 seconds.
- It calls the evalauteAlert() function of Script Include "EvtMgmtAlertManagementJob".
- During the evaluation process, Alert Management rules are used to filter the alerts and perform the remediation action accordingly.
Note - Do not delete an open alert. Close an alert first and then delete it.
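As a rough sketch of the de-duplication step, the snippet below opens a new alert only when no existing alert shares the event's message key. The in-memory `active_alerts` dictionary, the `handle_event` function, and the `event_count` field are hypothetical. The sketch assumes the store holds only alerts that are not closed, so it does not cover reopening.

```python
from typing import Dict


def handle_event(active_alerts: Dict[str, dict], message_key: str) -> str:
    """Open a new alert, or fold the event into an existing one."""
    alert = active_alerts.get(message_key)
    if alert is None:
        # No existing alert for this message key: open a new one.
        active_alerts[message_key] = {"state": "Open", "event_count": 1}
        return "opened"
    # De-duplication: the event belongs to an existing alert.
    alert["event_count"] += 1
    return "deduplicated"


store: Dict[str, dict] = {}
first = handle_event(store, "cpu:host1")
second = handle_event(store, "cpu:host1")
```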
- When a Clear event is received for an open alert, the alert is set to the "Closed" state.
- Closing an alert also closes any related incident that is not already resolved or closed.
- If there is no associated incident, only the alert state is changed to Closed and no further action is performed.
- When new events are processed and match an existing closed alert, that alert is reopened. An alert can also be reopened manually.
- Reopening of a closed alert by new events is controlled by the property "evt_mgmt.active_interval".
- The default value of this property is 14400 seconds (4 hours). If an alert is closed and a new event with the same message key is generated within 4 hours, the existing alert is reopened.
- When an alert is reopened, the related incident is processed as follows:
- If the incident is not Resolved or Closed, a work note is added to indicate that the related alert was reopened.
- If the incident is Resolved or Closed, the incident is reopened, a new incident is created, or nothing is done, depending on the evt_mgmt.alert_reopens_incident property value.
- If the incident is reopened, work notes are added to the incident.
- If a new incident is created, any matching alert management rule, alert action rule, and task template apply to the incident.
- If there is no matching alert rule or template, fields from the existing incident are copied to a new incident.
- The business rule that runs after an alert is reopened is "Reopen associated closed incident".
- This business rule calls the script include "EvtMgmtAlertManagementAlertReopenHandler", which again invokes the Alert Management process to find the matching rule and perform the remediation action.
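The reopen logic described above can be illustrated with a short sketch. It is not the platform implementation: `should_reopen`, `incident_action`, and the `ReopenPolicy` enum are hypothetical names, the enum values only paraphrase the three options of evt_mgmt.alert_reopens_incident rather than its real stored values, and treating the interval boundary as inclusive is an assumption.

```python
from datetime import datetime, timedelta
from enum import Enum

ACTIVE_INTERVAL_SECONDS = 14400  # default of evt_mgmt.active_interval (4 hours)


class ReopenPolicy(Enum):
    # Hypothetical labels for the evt_mgmt.alert_reopens_incident options.
    NEW = "create new incident"
    REOPEN = "reopen incident"
    NOTHING = "do nothing"


def should_reopen(closed_at: datetime, event_time: datetime,
                  active_interval: int = ACTIVE_INTERVAL_SECONDS) -> bool:
    """True if a matching event arrives within the active interval."""
    elapsed = event_time - closed_at
    return timedelta(0) <= elapsed <= timedelta(seconds=active_interval)


def incident_action(incident_state: str, policy: ReopenPolicy) -> str:
    """Decide what happens to the related incident when the alert reopens."""
    if incident_state not in ("Resolved", "Closed"):
        return "add work note"
    if policy is ReopenPolicy.REOPEN:
        return "reopen incident and add work notes"
    if policy is ReopenPolicy.NEW:
        return "create new incident"
    return "no action"


closed_at = datetime(2024, 1, 1, 12, 0, 0)
within = should_reopen(closed_at, closed_at + timedelta(hours=3))
outside = should_reopen(closed_at, closed_at + timedelta(hours=5))
```

With the default interval, `within` is `True` and `outside` is `False`.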
- Flapping is a state in which multiple open and close events are generated for the same alert in quick succession.
- Entry into the flapping state is determined by the values configured for "evt_mgmt.flap_interval" and "evt_mgmt.flap_frequency".
- An alert enters the flapping state when its current Flap Count value reaches or exceeds the given evt_mgmt.flap_frequency property value within the time period specified by the evt_mgmt.flap_interval property.
- A scheduled job, "Event Management - close flapping alerts", executes every 5 minutes and processes the flapping alerts.
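One way to picture the flapping check is a sliding window over state changes, as in the sketch below. The `FlapDetector` class is hypothetical, the interval and frequency values are illustrative rather than the platform defaults, and how ServiceNow actually maintains the Flap Count is not covered by this article, so the windowing here is an assumption.

```python
from collections import deque


class FlapDetector:
    """Counts flaps inside a sliding time window (illustrative only)."""

    def __init__(self, flap_interval_seconds: int, flap_frequency: int):
        self.interval = flap_interval_seconds   # stands in for evt_mgmt.flap_interval
        self.frequency = flap_frequency         # stands in for evt_mgmt.flap_frequency
        self._flaps = deque()

    def record_flap(self, timestamp: float) -> bool:
        """Record one open/close flap; return True if the alert is flapping."""
        self._flaps.append(timestamp)
        # Drop flaps that fall outside the interval ending at this timestamp.
        while self._flaps and timestamp - self._flaps[0] > self.interval:
            self._flaps.popleft()
        # Flapping once the count reaches or exceeds the frequency.
        return len(self._flaps) >= self.frequency


detector = FlapDetector(flap_interval_seconds=60, flap_frequency=3)
results = [detector.record_flap(t) for t in (0, 10, 20)]
```

In this run only the third flap, which lands within 60 seconds of the first, reports the alert as flapping.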
5. Additional Information
- Acknowledging an alert denotes that the alert is known and can temporarily be ignored.
- Acknowledging the alert does not assign it to you, nor does it create a task like an incident or change request. It simply lets other operators know that you are aware of the issue. After you acknowledge it, you will take further action during the triage stage.
Auto Closing Alert
- evt_mgmt.alert_auto_close_interval - An interval (in hours) within which open alerts are automatically closed. Setting it to 0 disables the feature.
- evt_mgmt.alert_closes_incident - Determines whether closing an alert resolves the incident, closes the incident, or does nothing.
- evt_mgmt.alert_reopens_incident - Determines whether reopening an alert creates a new incident, reopens the incident, or does nothing.
- evt_mgmt.incident_closes_alert - If true, resolving an incident closes the associated alerts; otherwise, no action is taken.
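The auto-close setting can be read as a simple age check, sketched below. The `should_auto_close` function is hypothetical. Measuring the interval from the alert's last update, and closing once the age reaches the interval, are both assumptions, because the property description does not say which timestamp is used.

```python
from datetime import datetime, timedelta


def should_auto_close(last_updated: datetime, now: datetime,
                      auto_close_interval_hours: int) -> bool:
    """True if an open alert is old enough to be closed automatically."""
    if auto_close_interval_hours <= 0:
        return False  # 0 disables the feature
    # Assumption: age is measured from the alert's last update.
    return now - last_updated >= timedelta(hours=auto_close_interval_hours)


now = datetime(2024, 1, 8, 0, 0, 0)
stale = should_auto_close(now - timedelta(hours=200), now, 168)
fresh = should_auto_close(now - timedelta(hours=2), now, 168)
disabled = should_auto_close(now - timedelta(hours=200), now, 0)
```

The 168-hour interval is only an example value.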
Points to focus
- Business rules created on alert tables should not take more than a few milliseconds to run. Instead of using a business rule, consider whether the same functionality can be achieved using a job.
- Do not use business rules to associate an alert with a CI. Use event rules for CI binding instead.