Friday, June 20, 2014

Using JCR-SQL2 for querying ACLs in a Jackrabbit repository

Keywords:
jcr-sql2 query jackrabbit ACL ACE access controls nt:hierarchyNode rep:policy

Problem:
The ultimate problem was actually how do you handle 'gracefully' removing Principals from the (default) jackrabbit security workspace when there's the potential they are being referenced by either: (a) groups; or (b) access controls (ACLs). This may be a topic for another (much more detailed) post, but for now the focus is on (b) detecting if there are any ACLs in the repository that reference the Principal.

Aside: why is it a concern? Removing the Principal won't effect enforcement of the access controls, but anything attempting to process the ACL definition will hit an error for the missing Principal:
javax.jcr.InvalidItemStateException: Item does not exist anymore: 6e332039-2956-323c-8e82-212de8f88ff0`

The AccesControl documentation on the jackrabbit wiki states:
How Resource-based ACLs are stored
Resource-based ACLs are stored per resource/node in a special child node rep:policy. This one will have a list of rep:GrantACE child nodes (usually named allow, allow0,...) for grant access control entries and rep:DenyACE child nodes (usually named deny, deny0,...) for deny access control entries.

Each ACE node has a rep:principalName STRING property pointing to the user or group this ACE belongs to, and a rep:privileges NAME multi-value property, containing all the privileges of this ACE.

Note that you can read/browse these nodes using the JCR API, but cannot modify them. This must always happen through the JCR access control API.

How do you target these rep:policy items in a JCR-SQL2 query ... and more importantly has anyone done this before - to save me the time?


Solution:
I couldn't find an example, but by careful reading of the builtin_nodetypes.cnd (and some trial and error) the following query will list all access control entities (ACEs) that reference a given Principal (i.e a User or Group) with-in a given access control list (ACL) set on a resource (file or folder).
select resource.*, ace.*
    from [nt:hierarchyNode] as resource
    inner join [rep:ACL] as acl
       ON ISCHILDNODE(acl, resource)
    inner join [rep:ACE] as ace
       ON ISCHILDNODE(ace, acl)
    where ace.[rep:principalName] = "kevin"

The results will look like (in table form):
Result Node-pathresource.jcr:createdByresource.jcr:createdresource.jcr:primaryTypeace.rep:globace.rep:nodePathace.rep:principalNameace.jcr:primaryType
/files/examplejack-admin2014-05-06T07:08:09.100+11:00nt:folderkevinrep:GrantACE

Note that 'Result Node-path' - full path to the resource - won't (and can't) be a a "column" Value in the result javax.jcr.query.Row items but can be obtained via a javax.jcr.Node item referenced by the javax.jcr.query.Row. Also note that the Principal reference here is via the 'name' as stored in the protected rep:principalName field - even if using the default security workspace, this won't be the full 'principal path' (path to the org.apache.jackrabbit.api.security.principal.ItemBasedPrincipal).

Notes:
The stackoverflow [jcr-sql2] info page includes a pretty neat summary of what JCR-SQL2 is with links to the reference specs and implementations (note the above is specific to jackrabbit). The JCR 2.0 SQL-2 Grammar diagrams are particularly useful.