OpenDJ, the open source LDAP directory service, uses indexes to optimise search queries. When a search query doesn’t match any index, the server has to cursor through the whole database to return the entries, if any, that match the search filter. These unindexed queries can consume a lot of resources: I/O, CPU… To limit resource consumption, OpenDJ rejects unindexed queries by default, except for Root DNs (such as cn=Directory Manager).
Today, I’m going to show you how to monitor for unindexed searches by keeping a dedicated log file, using the traditional access logger and filtering criteria.
First, we’re going to create a new access logger named “Searches”, which will write its messages to “logs/search”.
dsconfig -D "cn=directory manager" -w secret12 -h localhost -p 4444 -n -X create-log-publisher --set enabled:true --set log-file:logs/search --set filtering-policy:inclusive --set log-format:combined --type file-based-access --publisher-name Searches
Then we define a filtering criteria that restricts what gets logged to that file. Let’s log only “search” operations that are marked as “unindexed” and take more than “5000” milliseconds.
dsconfig -D "cn=directory manager" -w secret12 -h localhost -p 4444 -n -X create-access-log-filtering-criteria --publisher-name Searches --set log-record-type:search --set search-response-is-indexed:false --set response-etime-greater-than:5000 --type generic --criteria-name "Expensive Searches"
Voila! Now, whenever a search request is unindexed and takes more than 5 seconds, the server will log the request to logs/search (on a single line), as below:
$ tail logs/search
[12/Sep/2016:14:25:31 +0200] SEARCH conn=10 op=1 msgID=2 base="dc=example, dc=com" scope=sub filter="(objectclass=*)" attrs="+,*" result=0 nentries=10003 unindexed etime=6542
This file can be monitored and used to trigger alerts to administrators, or simply used to collect and analyse the filters that result in unindexed requests, in order to better tune the OpenDJ indexes.
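As a starting point for that kind of analysis, here is a minimal sketch that tallies the filters found in the log. It assumes each record sits on one line in the combined format of the sample above, with filter="..." and etime=N fields, and that the filter itself contains no double quote. The function name summarize_unindexed is my own and is not part of OpenDJ.

```shell
#!/bin/sh
# Summarise unindexed searches found in an OpenDJ access log.
# Assumption: one record per line, combined format, with
# filter="..." and etime=N fields as in the sample record.
# summarize_unindexed is an illustrative name, not an OpenDJ tool.

# Usage: summarize_unindexed FILE
# Prints "count<TAB>max_etime_ms<TAB>filter", most frequent filter first.
summarize_unindexed() {
  awk '
    / SEARCH / && /(^| )unindexed( |$)/ {
      # Skip records that do not carry both fields we need.
      if (!match($0, /filter="[^"]*"/)) next
      # Strip the leading filter=" (8 chars) and the closing quote.
      f = substr($0, RSTART + 8, RLENGTH - 9)
      if (!match($0, /etime=[0-9]+/)) next
      # Strip the leading etime= (6 chars) and force a number.
      e = substr($0, RSTART + 6, RLENGTH - 6) + 0
      count[f]++
      if (e > max[f]) max[f] = e
    }
    END {
      for (f in count) printf "%d\t%d\t%s\n", count[f], max[f], f
    }
  ' "$1" | sort -rn
}
```

Calling summarize_unindexed logs/search prints one line per distinct filter, with the number of occurrences and the slowest etime seen. The filters at the top of the list are the first candidates to look at when deciding which attributes to index.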
Note that sometimes it is a good option to leave some requests unindexed, because the cost of indexing them outweighs the benefit of the index. This is typically the case when the requests are infrequent, run by specific administrators for reporting purposes, and expected to return a lot of entries. If so, a best practice is to have a dedicated replica for administration and to run these expensive requests against it. It also helps if the client applications are tuned to expect these requests to take a long time.