From OpenNMS
Coming from a Netcool environment and later Nagios, I was used to an alarm's count and last update time being refreshed for as long as the alarm/alert existed. With Nagios I used this to send out notifications (mail and SMS) for every poll that failed.
In OpenNMS, nodeLostService events and alarms are only updated when a state change has been seen, so they give me no opportunity to have the system send out periodic notifications for as long as the outage exists.
After posting to the mailing list, I decided to have a look at the Automations.
The following automation clones a nodeLostService event for the lifetime of a nodeLostService alarm. It checks every minute and re-sends the event for any matching alarm that has gone 30 minutes without an event or automation update. It can be added to vacuumd-configuration.xml.
For simplicity and overview, I've left out all other automations, triggers and actions.
<automations>
  <!-- run this every minute -->
  <automation name="repeatEvent" interval="60000" active="true"
              trigger-name="selectAlarmsToRepeat"
              action-name="updateAlarmToRepeat"
              action-event="eventCloned" />
</automations>
<triggers>
  <!--
    Find alarms of type uei.opennms.org/nodes/nodeLostService, with a severity > 2.
    We reintroduce the event back into the system as a new event, to be able to send
    repeated notifications.
    - skip if the alarm is acknowledged
    - skip alarms that are not alarmtype 1
    - skip alarms that had an event or automation update in the last 30 minutes
  -->
  <trigger name="selectAlarmsToRepeat" operator=">=" row-count="1" >
    <statement>
      SELECT a.alarmid AS _alarmid,
             a.eventuei AS _eventuei,
             a.nodeid AS _nodeid,
             a.ipaddr AS _ipaddr,
             a.serviceid AS _serviceid,
             a.logmsg AS _logmsg,
             s.servicename AS _servicename,
             now() AS _ts
        FROM alarms a
        LEFT OUTER JOIN service s
          ON s.serviceid = a.serviceid
       WHERE a.eventuei = 'uei.opennms.org/nodes/nodeLostService'
         AND a.severity > 2
         AND a.alarmacktime IS NULL
         AND a.alarmtype = 1
         AND COALESCE(a.lastautomationtime, a.lasteventtime) &lt; now() - interval '30 minutes'
    </statement>
  </trigger>
</triggers>
<actions>
  <!-- Record the automation run on the alarm, so the trigger skips it for the next 30 minutes -->
  <action name="updateAlarmToRepeat" >
    <statement>
      UPDATE alarms
         SET firstautomationtime = COALESCE(firstautomationtime, ${_ts}),
             lastautomationtime = ${_ts}
       WHERE alarmid = ${_alarmid}
    </statement>
  </action>
</actions>
<action-events>
  <!--
    Create an action event, which in essence clones the original event/alarm settings
  -->
  <action-event name="eventCloned" for-each-result="true" >
    <assignment type="field" name="uei" value="${_eventuei}" />
    <assignment type="field" name="nodeid" value="${_nodeid}" />
    <assignment type="field" name="interface" value="${_ipaddr}" />
    <assignment type="field" name="service" value="${_servicename}" />
    <assignment type="parameter" name="alarmId" value="${_alarmid}" />
    <assignment type="parameter" name="logmsg" value="${_logmsg}" />
    <assignment type="parameter" name="alarmEventUei" value="${_eventuei}" />
  </action-event>
</action-events>
Note that this can lead to a lot of additional events and notifications in the GUI. It shouldn't be too hard to clean them up, or auto-ack them, prior to cloning the event.
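
To check whether the automation is actually firing, a query along these lines can be run against the OpenNMS database. This is only a sketch: it reuses the columns referenced in the trigger and action above and assumes that the alarms table also has a counter column that grows as the cloned events are reduced into the existing alarm. That assumption is not something the configuration above confirms.

```sql
-- Sketch: list unacknowledged nodeLostService alarms together with the
-- timestamps the automation maintains. After the first run,
-- firstautomationtime should be set, and lastautomationtime should advance
-- roughly every 30 minutes while the outage lasts.
-- "counter" is an assumed column (reduced event count); drop it if absent.
SELECT alarmid,
       nodeid,
       ipaddr,
       counter,
       lasteventtime,
       firstautomationtime,
       lastautomationtime
  FROM alarms
 WHERE eventuei = 'uei.opennms.org/nodes/nodeLostService'
   AND alarmacktime IS NULL
 ORDER BY lastautomationtime DESC NULLS LAST;
```

Alarms that show a NULL lastautomationtime have either not yet been picked up by the trigger or are still inside their first 30-minute window.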






