Better index troubleshooting with ForgeRock DS / OpenDJ

Many years ago, I wrote about troubleshooting indexes and search performances, explaining the magicdebugSearchIndex” operational attribute, that allows an administrator to get from the server information about the processing of indexes for a specific search query.

The returned value provides insights on the indexes that were used for a particular search, how they were used and how the resulting set of candidates was built, allowing an administrator to understand whether indexes are used optimally or need to be tailored better for specific search queries and filters, in combination with access logs and other tools such as backendstat.

In DS 6.5, we’ve made some improvements in the search filter processing and we’ve taken changed the format of the debugSearchIndex value to provide a better reporting of how indexes are used.

The new format is now JSON based, which allow to give it more structure and all could be processed programatically. Here are a few examples of output of the new debugSearchIndex attribute values.

$ bin/ldapsearch -h localhost -p 1389 -D "cn=directory manager" -b "dc=example,dc=com" "(&(cn=*Den*)(mail=user.19*))" debugsearchindex
Password for user 'cn=directory manager': *********

dn: cn=debugsearch
debugsearchindex: {"filter":{"intersection":[{"index":"mail.caseIgnoreIA5SubstringsMatch:6", "exact":"ser.19","candidates":111,"retained":111},{"index":"mail.caseIgnoreIA5SubstringsMatch:6", "exact":"user.1","candidates":1111,"retained":111},
{"filter":"(cn=*Den*)", "index":"cn.caseIgnoreSubstringsMatch:6",
"range":"[den,deo[","candidates":103,"retained":5}], "candidates":5},"final":5}

Let’s look at the debugSearchIndex value and interpret it:

{
   "filter": {
     "intersection": [
       {
         "index": "mail.caseIgnoreIA5SubstringsMatch:6",
         "exact": "ser.19",
         "candidates": 111,
         "retained": 111
       },
       {
         "index": "mail.caseIgnoreIA5SubstringsMatch:6",
         "exact": "user.1",
         "candidates": 1111,
         "retained": 111
       },
       {
         "filter": "(cn=*Den*)",
         "index": "cn.caseIgnoreSubstringsMatch:6",
         "range": "[den,deo[",
         "candidates": 103,
         "retained": 5
       }
     ],
     "candidates": 5
   },
   "final": 5
 }

The filter had 2 components: (cn=*Den*) and (mail=user.19*). Because the whole filter is an AND, the result set is an intersection of several index lookups. Also, both substring filters, but one is a substring of 3 characters and the second one a substring of 7 characters. By default, substring indexes are built with substrings of 6 characters. So the filters are treated differently. The server optimises the processing of indexes so that it will try to first to use the queries that are the most effective. In the case above, the filter (mail=user.19*) is preferred. 2 records are read from the index, and that results in a list of 111 candidates. Then, the server use the remaining filter to narrow the result list. Because the string Den is shorter than the indexed substrings, the server scans a range of keys in the index, starting from the first key match “den” and stopping before the key that matches “deo”. This results in 103 candidates, but only 5 are retained because they were parts of the previous result set. So the result is 5 entries that are matching these filters.

Note the [den,deo[ notation is similar to mathematical Set representation where [ and ] indicate whether a set includes or excludes the boundaries.

Let’s take an example with an OR filter:

$ bin/ldapsearch -h localhost -p 1389 -D "cn=directory manager" -b "dc=example,dc=com" "(|(cn=*Denice*)(uid=user.19))" debugsearchindex
Password for user 'cn=directory manager': *********

dn: cn=debugsearch
debugsearchindex: {"filter":{"union":[{"filter":"(cn=*Denice*)", "index":"cn.caseIgnoreSubstringsMatch:6","exact":"denice","candidates":1}, {"filter":"(uid=user.19)", "index":"uid.caseIgnoreMatch","exact":"user.19","candidates":1}],"candidates":2},"final":2}

As you can see, the result is now a union of 2 exact match (i.e. reads of index keys), each resulting a 1 candidate.

Finally here’s another example, where the scope is used to attempt to reduce the candidate list:

$ bin/ldapsearch -h localhost -p 1389 -D "cn=directory manager" -b "ou=people,dc=example,dc=com" -s one "(mail=user.1)" debugsearchindex
Password for user 'cn=directory manager': *********

dn: cn=debugsearch
debugsearchindex: {"filter":{"filter":"(mail=user.1)","index":"mail.caseIgnoreIA5SubstringsMatch:6", "exact":"user.1","candidates":1111},"scope":{"type":"one","candidates":"[NOT-INDEXED]","retained":1111},"final":1111}

You can find more information and details about the debugsearchindex attribute in the ForgeRock Directory Services 6.5 Administration Guide.

0 Comments

Leave a reply

©2019 ForgeRock - we provide an identity and access platform to secure every online relationship for the enterprise market, educational sector and even entire countries. Click to view our privacy policy and terms of use.

Log in with your credentials

Forgot your details?