Monitor notification changes (#1024)
Artur Hefczyc opened 6 years ago

We need some changes for the monitor notifications:

  • One summary notification a day with a brief status from all monitored systems: MEM, CPU, XMPP, and others. Similar to what we had in the past; example attached.
  • If any subsystem has a WARNING or SEVERE problem, the first notification is sent right away and any subsequent notifications for that subsystem are sent once an hour until the problem is fixed. Once the problem is fixed, the cycle starts over: when something goes wrong again, the first notification is sent right away and then once an hour.
  • Notifications about log entries are sent at most once an hour.
  • Notifications about log entries cannot exceed 1MB in size, preferably they should be at most 100kB
  • Email notifications should name the subsystem in the email Subject. For example: Memory Monitor, Log Monitor, CPU Monitor
  • If a problem is discovered, the notification should also include a brief description of the problem before the actual data. For example, if there is high CPU activity, the email should say at the beginning: "High CPU Activity discovered: 98%", or "Log entry level WARNING/SEVERE discovered", ...
  • Actually, every notification from the monitor should contain full information, like in the attached screenshot. Just the problematic metrics should be indicated by a different color (green - OK, yellow - warning, red - severe). Plus, of course, additional information or attachment if necessary.
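
The "first notification right away, then once an hour until fixed" rule above can be sketched as a small per-subsystem throttle. This is only an illustration of the requested behaviour, not existing Tigase code: the class `NotificationThrottle` and its methods `shouldNotify` and `problemResolved` are hypothetical names, and the caller is assumed to pass in the current time and to report when a subsystem recovers.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the Tigase API. */
class NotificationThrottle {

    // Minimum time between repeated notifications for the same ongoing problem.
    private final Duration repeatInterval;

    // Time of the last notification sent per subsystem while its problem is active.
    private final Map<String, Instant> lastSent = new HashMap<>();

    NotificationThrottle(Duration repeatInterval) {
        this.repeatInterval = repeatInterval;
    }

    /**
     * Returns true if a notification for a problem in the given subsystem
     * should be sent at time 'now', and records that it was sent.
     */
    synchronized boolean shouldNotify(String subsystem, Instant now) {
        Instant last = lastSent.get(subsystem);
        // No entry means this is the first report of a new problem: send right away.
        // Otherwise send only if the repeat interval has fully elapsed.
        if (last == null || !now.isBefore(last.plus(repeatInterval))) {
            lastSent.put(subsystem, now);
            return true;
        }
        return false;
    }

    /** Call when the subsystem is healthy again, so the next problem notifies immediately. */
    synchronized void problemResolved(String subsystem) {
        lastSent.remove(subsystem);
    }
}
```

A monitor task would call `shouldNotify("CPU", Instant.now())` each time it sees a WARNING or SEVERE state, and `problemResolved("CPU")` once the metric is back to normal. The same class with a one-hour interval could also cap log-entry notifications.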

Screenshot 2019-04-23 09.45.57.png
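
The Subject and size requirements from the list above could look roughly like the sketch below. Everything here is an assumption made for illustration: the class `LogNotificationFormat`, the `[LEVEL] <Subsystem> Monitor` subject layout, and the choice to simply cut the body at a byte limit (100 kB preferred, per the issue) are not taken from the Tigase code base.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not part of the Tigase API. */
class LogNotificationFormat {

    // "Preferably at most 100kB" from the issue text; the hard limit there is 1MB.
    static final int PREFERRED_MAX_BYTES = 100 * 1024;

    /** Builds a Subject naming the subsystem, e.g. "[WARNING] Memory Monitor" (assumed layout). */
    static String subject(String subsystem, String level) {
        return "[" + level + "] " + subsystem + " Monitor";
    }

    /** Truncates the body so that its UTF-8 encoding is at most maxBytes long. */
    static String capBody(String body, int maxBytes) {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        if (bytes.length <= maxBytes) {
            return body;
        }
        // bytes[end] is the first byte that would be dropped. If it is a UTF-8
        // continuation byte (10xxxxxx), move back so no character is split.
        int end = maxBytes;
        while (end > 0 && (bytes[end] & 0xC0) == 0x80) {
            end--;
        }
        return new String(bytes, 0, end, StandardCharsets.UTF_8);
    }
}
```

For example, `capBody(logText, LogNotificationFormat.PREFERRED_MAX_BYTES)` would keep a log notification under the preferred limit. A real implementation might instead keep the most recent entries or attach the full log separately.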

Artur Hefczyc commented 6 years ago

@bmalkow I received about 1,000 notifications over the last few days, so these changes are really critical.

Referenced from commit 1 year ago
Type: Task
Priority: Critical
Assignee:
Subsystem: monitor
Reference
tigase/_server/server-core#1024