cts:document-permission-query()

Released in MarkLogic 11

Clever Llamas
2023-01-31

By far the most useful of the small updates in the MarkLogic 11 release is cts:document-permission-query(). Prior to this release, finding out which content had certain permissions was a time-, memory- and CPU-consuming task: you had to run a query as a user holding the role in question.

A question that compares more than one role, such as "Which documents have read access by Role-A and do not have read access by Role-B", becomes more complex. In the end, such questions had to be answered in memory against various sequences of URIs. Now we can stay in the cts-query space and resolve all of this at query time.
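As a minimal sketch of how the Role-A/Role-B question can look in MarkLogic 11, the two permission queries can be combined with cts:and-not-query(). "Role-A" and "Role-B" are placeholder role names taken from the question above and are assumed to exist; the query should be run as a user who can see every document (for example, an admin), because results are still limited to what the caller can read.

```xquery
xquery version "1.0-ml";

(: "Role-A" and "Role-B" are placeholder role names, assumed to exist :)
let $query := cts:and-not-query(
  (: positive side: documents Role-A can read :)
  cts:document-permission-query("Role-A", "read"),
  (: negative side: exclude documents Role-B can read :)
  cts:document-permission-query("Role-B", "read")
)

(: index-level count; swap in cts:uris((), (), $query) to list the URIs :)
return cts:estimate($query)
```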

Setup

As with all of our in-depth investigations of features, we try to set up data that will truly test the feature. For this simple feature, testing the pre-MarkLogic 11 solution against the MarkLogic 11 solution requires roles as well as data with various combinations of permissions. For brevity, the samples below report simple estimates; the original tests also proved that we had exactly the documents expected. For completeness, we have included the full setup code. The setup may look a bit complex (for example, spawning tasks to cut down on load time). That is because the full tests used more data, along with items like the URIs and attributes, to further validate the data and code.

xquery version "1.0-ml";
import module namespace sec="http://marklogic.com/xdmp/security" at 
    "/MarkLogic/security.xqy";

declare function local:create-roles($role-names){
  for $role-name in $role-names
    return xdmp:invoke-function(function(){
      if(not(sec:role-exists($role-name)))
        then sec:create-role($role-name, "sample role for content validation of cts:document-permission-query()", (), (), (), (), (), ())
        else ()
    }, map:entry("database", xdmp:security-database()))
};

declare function local:create-docs($permissions, $prefix, $iterations, $number-per-iteration){
  (: spawn to get them in faster :)
  for $x in (1 to $iterations)
    return xdmp:spawn-function(function(){
        for $y in (1 to $number-per-iteration)
          let $uri := "/clever-llamas/test/cts-document-permission-query/" || $prefix || ":" || $x || "-" || $y
          return (1,  xdmp:document-insert($uri, <llama prefix="{$prefix}" x="{$x}" y="{$y}"/>, map:entry("collections", "/clever-llamas/test/cts-document-permission-query")=>map:with("permissions", $permissions)))
    }, map:entry("result", fn:true()))
};

(: roles :)
let $role-names := ("llama-writer", "llama-herder", "llama-walker")
let $_ := local:create-roles($role-names)

(: Permissions that we will use in the various sets of data :)
let $llama-writer-permissions := (
  xdmp:permission("llama-writer", "insert", "object"),
  xdmp:permission("llama-writer", "node-update", "object"),
  xdmp:permission("llama-writer", "update", "object")
)
let $llama-herder-permissions := (xdmp:permission("llama-herder", "read", "object"))
let $llama-walker-permissions := (xdmp:permission("llama-walker", "read", "object"))

return fn:count((
  (: 90000 with write only :)
  local:create-docs(($llama-writer-permissions), "writer-only", 30, 3000),
  (: 90000 where the herder can also read, but not the walker :)
  local:create-docs(($llama-writer-permissions, $llama-herder-permissions), "herder", 30, 3000),
  (: 90000 where the walker can also read, but not the herder:)
  local:create-docs(($llama-writer-permissions, $llama-walker-permissions), "walker", 30, 3000),
  (: 90000 where the walker AND herder can read:)
  local:create-docs(($llama-writer-permissions, $llama-walker-permissions,  $llama-herder-permissions), "walker-and-herder", 30, 3000)
))

In this case, there are 360,000 documents loaded:

  • 90,000 with write only
  • 90,000 where the herder can read, but not the walker
  • 90,000 where the walker can read, but not the herder
  • 90,000 where the walker AND herder can read

[Image: Documents]

Samples

Below we answer the same question first in MarkLogic 10 and then in MarkLogic 11. We've included one example; it is easy to expand on it with different combinations.

QUESTION

Which documents have read access by both the llama-walker AND llama-herder?

MarkLogic 10 Solution

This question is not easily answered in versions prior to MarkLogic 11. Even with a TDE template and xdmp:node-permissions(), there is no way to unpack that set of permissions easily in Optic prior to MarkLogic 11.

For this, I have usually had to take a 3-step approach:

  • Generate a temporary user with the role in question
  • Invoke a function running as that user to get the URIs
  • Tear down the user in question
xquery version "1.0-ml";
import module namespace sec="http://marklogic.com/xdmp/security" at 
    "/MarkLogic/security.xqy";

(: function for creating temporary user and attaching to a role :)
declare function local:create-temporary-user-for-role($role-name){
  let $user-name := "clever-llamas-temp-" || sem:uuid-string()
  
  let $_ := xdmp:invoke-function(function(){
    sec:create-user(
        $user-name,
        "temporary-user for document-permissions-query",
        sem:uuid-string(),
        $role-name,
        (),
        ()
    )
   
  }, map:entry("database", xdmp:security-database()))

  return $user-name 
};

(: function for deleting temporary user:)
declare function local:delete-temporary-user($user-name){
  xdmp:invoke-function(function(){
    if(fn:starts-with($user-name, "clever-llamas-temp-"))
      then
        xdmp:invoke-function(function(){sec:remove-user($user-name)}, map:entry("database", xdmp:security-database()))
      else
        ()
  }, map:entry("database", xdmp:security-database()))
};

(: Temporary users :)
let $llama-walker-user := local:create-temporary-user-for-role("llama-walker")
let $llama-herder-user := local:create-temporary-user-for-role("llama-herder")

(: URIs for llama-walker:)
let $llama-walker-uris := xdmp:invoke-function(function(){
    cts:uris((), (), cts:collection-query("/clever-llamas/test/cts-document-permission-query"))
    }, map:entry("userId", xdmp:user($llama-walker-user))
  )

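(: Count the llama-walker URIs that llama-herder can also read,
   then tear down the temporary users (step 3 of the approach) :)
return (
  xdmp:invoke-function(function(){
    cts:estimate(cts:and-query((
          cts:collection-query("/clever-llamas/test/cts-document-permission-query"),
          cts:document-query($llama-walker-uris)
        ))
      )
    }, map:entry("userId", xdmp:user($llama-herder-user))
  ),
  local:delete-temporary-user($llama-walker-user),
  local:delete-temporary-user($llama-herder-user)
)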


This takes some time to run. Not surprisingly, the majority of the time is spent in the final query, which has to compare against the entire list of URIs readable by llama-walker (180,000 in this data set), of which only half match.

[Image: pre-11]

MarkLogic 11 Solution

In MarkLogic 11, we can simply pass a very simple query and have it resolve against the fragments and their embedded permissions via cts:document-permission-query().

xquery version "1.0-ml";

(: documents that can be read by llama herder AND llama-walker:)
let $query := cts:and-query((
  cts:collection-query("/clever-llamas/test/cts-document-permission-query"),
  cts:document-permission-query("llama-herder", "read"),
  cts:document-permission-query("llama-walker", "read")
))

return cts:estimate($query) (:cts:uris((), (),  $query):)

As we can see, the results are what we would expect from something resolved immediately at the index level.

[Image: example-11]
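To illustrate how the same pattern expands to other combinations, here is a rough sketch against the test data from the setup. It only uses the standard cts:and-not-query(), cts:or-query() and cts:not-query() constructors around the new permission query. If the setup loaded all 360,000 documents and the query runs as an admin user, the three estimates should come out to 90,000, 270,000 and 90,000 respectively.

```xquery
xquery version "1.0-ml";

let $collection := cts:collection-query("/clever-llamas/test/cts-document-permission-query")
let $herder-read := cts:document-permission-query("llama-herder", "read")
let $walker-read := cts:document-permission-query("llama-walker", "read")

return (
  (: readable by the herder but not by the walker :)
  cts:estimate(cts:and-query((
    $collection,
    cts:and-not-query($herder-read, $walker-read)
  ))),
  (: readable by at least one of the two roles :)
  cts:estimate(cts:and-query((
    $collection,
    cts:or-query(($herder-read, $walker-read))
  ))),
  (: readable by neither role, i.e. the writer-only documents :)
  cts:estimate(cts:and-query((
    $collection,
    cts:not-query(cts:or-query(($herder-read, $walker-read)))
  )))
)
```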

Conclusion

In a recent real-life example, I had to answer the question: "Which documents are missing read permission for role X?" Before MarkLogic 11, this essentially requires listing all URIs in the system and comparing them to a document query run as admin. In MarkLogic 10, across a cluster with a considerable amount of data, that took 7 minutes. Once you are in a cluster, the more nodes you have, the more you have to marshal lists of URIs across the wire to query on each D-node. That is expensive. The same question in MarkLogic 11 would have been simple and fast to answer.
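A sketch of how that question might be phrased in MarkLogic 11, assuming it is run as an admin user so that every document is visible. "role-x" is a placeholder for the role being audited, and the limit of 10 is only there to keep the sample output small.

```xquery
xquery version "1.0-ml";

(: "role-x" is a placeholder role name, assumed to exist :)
let $missing-read := cts:not-query(
  cts:document-permission-query("role-x", "read")
)

return (
  (: how many documents lack read permission for the role :)
  cts:estimate($missing-read),
  (: a sample of the matching URIs; requires the URI lexicon :)
  cts:uris((), "limit=10", $missing-read)
)
```

Because everything stays in the cts-query space, no list of URIs has to be built up and passed back into a second query.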

Even with these simple examples, this API function makes a huge difference in code complexity, resource utilization and performance.
