Basic googling usually shows articles with alerts via exec
. That makes configuration overcomplicated as you have to duplicate it for an alert and for a real action. However you can setup your sendmail to do the slack messaging.
- name: put monit config
copy:
dest: /etc/monit.d/foo.conf
content: |
set alert slack.monit@localhost but not on { instance }
set mailserver 127.0.0.1
set mail-format {
message: MONIT_EVENT $EVENT
MONIT_SERVICE $SERVICE
MONIT_DATE $DATE
MONIT_HOST $HOST
MONIT_ACTION $ACTION
MONIT_DESCRIPTION $DESCRIPTION
}
# some basic check as an example
check process foobar with pidfile /var/run/foobar.pid
start program = "/sbin/start foobar"
stop program = "/bin/bash -c '/usr/local/bin/delayed-rolling-cluster-stop 2>&1|logger -t foo-bar'" with timeout 600 seconds
if failed
host 127.0.0.1
port 8080
protocol http
request /ping
timeout 10 seconds
then
restart
notify: reload monit
Note that you may want to make your slack URL a bit more secured. ;-)
- name: put mail2slack script
copy:
dest: /etc/smrsh/mail2slack.sh
content: |
#!/bin/bash
formail -I ""| {
while read k v
do
test -n "$k" -a -n "$v" && declare $k="$v"
done
URL=https://hooks.slack.com/services/{{ slack_hook_uri }}
COLOR=${MONIT_COLOR:-$(echo $MONIT_EVENT | egrep -qi "succeeded|exists|recovery| ok|done| up" && echo good || echo danger)}
TEXT=$(echo -e "$MONIT_EVENT: $MONIT_DESCRIPTION" | python -c "import json,sys;print(json.dumps(sys.stdin.read()))")
PAYLOAD="{
\"attachments\": [
{
\"text\": \"$MONIT_EVENT: $MONIT_DESCRIPTION\",
\"color\": \"$COLOR\",
\"mrkdwn_in\": [\"text\"],
\"fields\": [
{ \"title\": \"Date\", \"value\": \"$MONIT_DATE\", \"short\": true },
{ \"title\": \"Host\", \"value\": \"{{ env }}: $MONIT_HOST\", \"short\": true }
]
}
]
}"
curl -s -X POST --data-urlencode "payload=$PAYLOAD" $URL
}
mode: 0755
- name: line to aliases
lineinfile:
dest: /etc/aliases
line: 'slack.monit: |/etc/smrsh/mail2slack.sh'
notify: reload aliases
handlers:
- name: reload aliases
shell: newaliases
- name: reload monit
service: name=monit state=reloaded
delayed-rolling-cluster-stop
is a script to restart malfunctioning node in a cluster of 3 nodes and should prevent simultaneous restart of all the nodes. Each node has a 2 minute restart window based on node_id
parameter in config file.
- name: put delayed-rolling-cluster-stop
copy:
dest: /usr/local/bin/delayed-rolling-cluster-stop
content: |
#!/bin/bash -x
delay=$(( (($(date +%M)+$(grep node_id /etc/foo-bar.yml|sed -E 's/.*:\s"([0-9]+)"/\1/')*2) % 6)*60 ))
echo "delay=${delay}"
sleep $delay
/usr/bin/killall -INT foo-bar || /sbin/stop foobar
mode: 755