November 27, 2014 at 10:13 am #1517 alex.nalin Participant
Hello FR community,
Which services or check commands would you recommend including in a system monitoring tool (such as Nagios, Opsview, etc.) to monitor OpenDJ?
Do you have any experience to share about how you monitor OpenDJ in your environments? Which components exactly would you keep under watch?
Thank you in advance.
December 2, 2014 at 5:34 pm #1545 Ludo Moderator
There are a number of existing scripts and integrations for OpenDJ with monitoring solutions such as Nagios, leveraging LDAP monitoring, SNMP, or JMX.
Unfortunately, as the product manager for OpenDJ, I don’t have much experience with monitoring live production services. However, I can give some information about monitoring the OpenDJ components (using LDAP, but equivalent MBeans exist for JMX):
OpenDJ exposes general information, which shouldn’t change much but is useful for identifying an instance.
The full version of OpenDJ is in its own cn=Version,cn=monitor entry (static).
Additional system data can be found in cn=System Information,cn=monitor: location of the instance, hostname, OS and HW info, JVM version and path, JVM supported SSL protocols and ciphers…
To get an idea of the overall load, you can read the main entry dn: cn=monitor.
In addition to identifying the server (version, vendor…), this entry contains the uptime, the current number of open connections, and the maximum number of connections. The cn=Work Queue,cn=Monitor entry has information about the work queue (its current, average, and maximum backlog).
There are per-operation statistics, which are spread across the connection handlers (LDAP, LDAPS, Administration Connector…). The best way to access them is to use a filter on their specific objectclass: (objectclass=ds-connectionhandler-statistics-monitor-entry).
One would probably only need to check the LDAP and LDAPS connection handler stats (stats for the Administration Connector should have little value, as this handler is restricted to administrative tools such as Control-Panel, dsconfig…).
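As a rough illustration of working with the result of that search, here is a minimal Python sketch. It assumes the search output has been saved as plain LDIF text, and it guesses that the Administration Connector entry can be recognised by the word "Administration" in its DN; both the helper names and that naming convention are my assumptions, not something OpenDJ guarantees.

```python
# Filter quoted from the post above; pass it to whatever LDAP client you use.
STATS_FILTER = "(objectclass=ds-connectionhandler-statistics-monitor-entry)"


def parse_ldif_entries(ldif_text):
    """Split simple LDIF text into {dn: {attribute: [values]}}.

    Simplification: folded lines and base64 values ("attr:: ...") are
    not handled; this is only meant for plain monitoring counters.
    """
    entries = {}
    current = None
    for line in ldif_text.splitlines():
        if not line.strip():
            current = None  # a blank line ends the current entry
            continue
        if line.startswith("#"):
            continue  # LDIF comment
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not an "attribute: value" line
        value = value.strip()
        if name.lower() == "dn":
            current = entries.setdefault(value, {})
        elif current is not None:
            current.setdefault(name, []).append(value)
    return entries


def without_admin_connector(entries):
    """Drop entries whose DN mentions "Administration" (assumed naming)."""
    return {dn: attrs for dn, attrs in entries.items()
            if "administration" not in dn.lower()}
```

Running the search with STATS_FILTER under cn=monitor and feeding the output through these two helpers should leave only the handlers worth watching, provided the DN assumption holds on your version.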
The entry contains 3 sets of statistics:
– The first set holds the overall read/write stats (count of messages, bytes).
– The second set of stats is based on Sun DSEE stats and has the number of requests and responses for each operation.
– The third set is more oriented towards performance metrics, since it contains the count of each operation finished and the total execution time of these operations (xxx-total-count and xxx-total-time). That latter set can be really useful for monitoring performance, and more specifically running averages of operation times, by querying it on a regular basis and computing the differences from the previous call. Both the average ops/sec and the etime per operation can be computed.
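The diff computation described for the third set could look like the following Python sketch. The function name is hypothetical, "search" is just one example of the xxx prefix, and the average time comes out in whatever unit the server uses for xxx-total-time.

```python
def operation_rates(previous, current, interval_seconds, operation="search"):
    """Return (ops_per_second, avg_time_per_op) between two samples.

    `previous` and `current` are dicts holding the cumulative
    "<operation>-total-count" and "<operation>-total-time" values
    read from the same connection handler at two points in time.
    Returns None if the counters went backwards (e.g. after a restart).
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    count_key = f"{operation}-total-count"
    time_key = f"{operation}-total-time"
    delta_count = current[count_key] - previous[count_key]
    delta_time = current[time_key] - previous[time_key]
    if delta_count < 0 or delta_time < 0:
        return None  # counters were reset; skip this interval
    ops_per_second = delta_count / interval_seconds
    # Avoid dividing by zero when no operation finished in the interval.
    avg_time_per_op = delta_time / delta_count if delta_count else 0.0
    return ops_per_second, avg_time_per_op
```

For example, going from 1000 operations and 5000 time units to 1600 operations and 6200 time units over 60 seconds gives 10 ops/sec and an average of 2 time units per operation.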
Finally, the monitoring entries under cn=Disk Space Monitor,cn=monitor return the disk location and the available disk space for each database backend. These are worth monitoring to trigger alerts if the free disk space shrinks too much (note that OpenDJ has a built-in alert mechanism for this). That said, if the thresholds have not been set appropriately, the disk may fill up so quickly that alerts become useless.
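A check on the free space could be sketched as below in Python. The exit codes are the usual Nagios plugin values; the thresholds, the function names, and the idea of estimating the time left from two successive readings are my own illustration, not part of OpenDJ.

```python
OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin exit codes


def disk_status(free_bytes, warn_bytes, crit_bytes):
    """Map free disk space to a Nagios status using two thresholds."""
    if crit_bytes > warn_bytes:
        raise ValueError("critical threshold must not exceed warning threshold")
    if free_bytes <= crit_bytes:
        return CRITICAL
    if free_bytes <= warn_bytes:
        return WARNING
    return OK


def seconds_until_full(previous_free, current_free, interval_seconds):
    """Estimate the time left before the disk is full.

    Uses the space consumed between two readings taken
    `interval_seconds` apart. Returns None if usage is not growing.
    """
    consumed = previous_free - current_free
    if consumed <= 0:
        return None
    return current_free * interval_seconds / consumed
```

Alerting on the estimated time left, in addition to a fixed threshold, is one way to catch a disk that is filling up faster than the threshold was designed for.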
One can also look at the number of entries in each backend and base DN, which is available as part of the ds-backend-monitor-entry objects. This data could be interesting to monitor and compare across all replicas, and also as a general metric to understand the growth of the data and decide on possible actions.
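Comparing the entry counts across replicas could be done along these lines in Python. The replica names and the function are hypothetical, and a small difference may simply reflect normal replication delay, so treat the result as a hint rather than an error.

```python
def lagging_replicas(counts_by_replica):
    """Return {replica: entries_behind} for replicas below the highest count.

    `counts_by_replica` maps a replica name to the number of entries
    it reports for one base DN.
    """
    if not counts_by_replica:
        raise ValueError("no replica counts given")
    highest = max(counts_by_replica.values())
    return {name: highest - count
            for name, count in counts_by_replica.items()
            if count != highest}
```

For instance, with counts of 1000, 1000, and 990 on three replicas, only the third one is reported, as 10 entries behind.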
I hope this helps.
Ludo
December 9, 2014 at 2:54 pm #1775 alex.nalin Participant
This information helps and clarifies a lot!
Thanks again for your valuable contribution.