This page is intended to discuss the topic of "Groups versus Privilege Management and how to determine which is appropriate for your task".



  1. Based on our experience with the Roles Database at MIT and its principles (originally from Scott Thorne in the mid-1990s and refined over time as we've gained experience with the system), I've written down some considerations for why a 3-part privilege management model would be appropriate for controlling access within an application.  I've also tried to list reasons why a group model might be appropriate, though I may not be a very good advocate for access control based on group membership.  I have not included any text about privilege management systems using a different model, and have not written down anything about rules-based access.

    However, even within a rules-based system, it is worthwhile considering how you represent privileges (groups, 3-part model, something else), i.e., what is the generated result of the application of a rule?

    I hope this document serves as a good starting point for our discussions.

        Jim Repa

    - - - - - - - -

    I. What are the two models to be considered?
    Let's consider two models for representing privileges or permissions:
      (1) A 3-part "Authorization", where the three parts are
          (Subject, Verb, Object), or in MIT's Roles Database terminology
          (Person, Function, Qualifier).
          In this model, a permission is represented by a triplet that says
          that a given Person (or agent) is authorized to perform a
          given Function (activity or transaction) for a given Qualifier
          (a unit or branch of a tree that represents the organizational,
          financial, or other unit or area for which the person is allowed
          to perform the given Function).
      (2) Group membership, a 2-part representation
          In this model we have a tuple that says a given Person (or agent)
          is a member of a given Group.  There must be a mapping or string
          parsing rule somewhere that ties group membership to the functions
          or transactions that group members are allowed to perform.
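
As a minimal sketch of how the two representations differ as data, the following uses hypothetical names throughout (the group name `BIO101_GRADERS` and the `GROUP_MEANING` mapping are illustrative assumptions, not taken from the Roles Database or any group system). It shows that the triplet states the permission by itself, while the 2-part membership needs a separate mapping before it means anything.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Authorization:
    """3-part model: (Person, Function, Qualifier)."""
    person: str
    function: str
    qualifier: str


@dataclass(frozen=True)
class GroupMembership:
    """2-part model: (Person, Group)."""
    person: str
    group: str


# 3-part: the permission is fully stated by the triplet itself.
auth = Authorization("joe", "Enter grades", "Biology 101")

# 2-part: the membership alone does not say what is allowed.
membership = GroupMembership("joe", "BIO101_GRADERS")

# Hypothetical out-of-band mapping that ties a group to what it permits.
GROUP_MEANING = {"BIO101_GRADERS": ("Enter grades", "Biology 101")}


def to_authorization(m: GroupMembership) -> Authorization:
    """Interpret a group membership using the out-of-band mapping."""
    function, qualifier = GROUP_MEANING[m.group]
    return Authorization(m.person, function, qualifier)
```

Here `to_authorization(membership)` yields the same triplet as `auth`, but only because `GROUP_MEANING` exists somewhere outside the membership record.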
    II. What are the implications of the 3-part Authorization model?
    Since the group model is probably more familiar to most people than the 3-part MIT Roles Database style authorization, let's look more closely at the implications of the 3-part Authorization model.  In a 3-part Authorization, where the parts are (subject, verb, object) or synonymously (Person, Function, Qualifier), the verb or Function component represents some activity or transaction that the user can perform.  It could be a high-level role that represents more than one kind of transaction, or it could be a discrete transaction, but it should be something that can be easily articulated in business terminology.  The object or Qualifier component is very important in any decentralized institution where activities and privileges are distributed to individuals based on organizational, academic, or financial divisions.  The verb or Function represents a common activity, and the object or Qualifier indicates where the authorized person is allowed to perform this activity.
    For example, if the Function is "Create a requisition to spend money", then the Qualifier, representing the area where the person can perform the Function, will be an account number or a branch within the tree of account numbers.   If the Function is "Enter grades", then the Qualifier could be an individual section within a course, or it could be a branch within the tree of courses that might represent all sections of all courses within the department of Biology.
    There might be multiple Functions that could be performed against a given type of object.  For example, for academic courses, Functions might be "Enter grades", "Schedule final exams", "View grades", "Schedule sections", etc.  Any Authorization to perform one of these Functions is not complete without the Qualifier, which must be a course, a section within a course, or a branch within the tree of courses.
    By specifying Authorizations in the form (Person, Function, Qualifier), we provide a clear and versatile way for (1) representing Authorizations within a database, (2) asking questions about Authorizations within an application, and (3) talking about Authorizations among people.  This representation reduces ambiguity within computer software and among people.  It is not sufficient to say "Please give Joe access to Biology 101".  This statement gives you the Person (Joe) and the Qualifier (Biology 101), but omits the Function.  So, you would respond to such a statement by asking "What is the Function that you want Joe to be able to perform for Biology 101?  Is it 'Enter grades', 'View grades', 'Schedule final exams', etc.?"
    Similarly, it is not sufficient to say "Please allow Joe to Enter grades."  We've got the Person and the Function, but we're missing the Qualifier.  The appropriate response would be to ask "Allow Joe to Enter grades for what - for what course, or what branch of the tree of courses do you want him to be able to enter grades?"
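
To illustrate how a Qualifier that names a branch of a tree can cover everything beneath it, here is a minimal sketch of an authorization check. The course tree, the sample grants, and the function names `ancestors_and_self` and `is_authorized` are all hypothetical, and a single-parent tree is assumed for simplicity.

```python
# Hypothetical course tree, stored as child -> parent.
PARENT = {
    "Biology 101 Section A": "Biology 101",
    "Biology 101": "Department of Biology",
    "Department of Biology": "All courses",
}

# Hypothetical (Person, Function, Qualifier) triplets.
AUTHORIZATIONS = {
    ("joe", "Enter grades", "Biology 101"),
    ("ann", "View grades", "Department of Biology"),
}


def ancestors_and_self(qualifier):
    """Yield the qualifier, then each ancestor up to the root."""
    while qualifier is not None:
        yield qualifier
        qualifier = PARENT.get(qualifier)


def is_authorized(person, function, qualifier):
    """True if a grant exists at the qualifier or at any branch above it."""
    return any(
        (person, function, q) in AUTHORIZATIONS
        for q in ancestors_and_self(qualifier)
    )
```

A call must supply all three parts: `is_authorized("joe", "Enter grades", "Biology 101 Section A")` is answerable, whereas a request that omits the Function or the Qualifier cannot even be expressed.
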
    Of course it is possible to represent these sorts of permissions using Groups.  For example, you could create lots of Groups - the cross product of all course-related Functions and all sections within courses.  You might have group names such as the following:
    You could then make people members of the appropriate groups to give them the appropriate privileges.  You could also use groups of groups to aggregate the fine-level permissions.  (But since you've forced a [Function, Qualifier] pair to be represented by a single object [a group], you make it a harder problem to aggregate the objects in a natural and practical way.  Having a tree of Qualifiers and a separate tree of Functions gives you a more natural way of representing the relationships between these entities.)
    III. What are advantages and disadvantages of each model?
    Let's try to list advantages of each model.
      A. Advantages of group model
         1. Some software packages are designed to organize
            privileges into groups
         2. Some software tools, e.g., LDAP or Grouper, do a
            good job of supporting groups
         3. Some permissions within applications are
            well-represented by groups, particularly where the
            application is not designed to support distribution
            of responsibilities.  If you can either enter grades
            for all courses or for no courses because the
            application is designed that way and does not allow
            privileges limited by course or section, then a single
            group "ENTER_GRADES" is sufficient to represent that
         4. Some people find the group model comfortable and
            easy to understand
      B. Advantages of 3-part Authorization model
         1. The 3-part model is a more natural and effective
            way to represent privileges where responsibilities
            and privileges are distributed within a large
         2. The 3-part model prompts people to talk about
            privileges in a clearer way, e.g.,
            "Allow Joe (Person) to Enter grades (Function)
            for course Biology 101 (Qualifier)" rather
            than "Put Joe in group XYZ", or
            "Give Joe access to Biology 101" - but what
            do you mean by "access"?
         3. The 3-part model provides a useful way to describe
            privileges within an application - even if there is
            a forced translation from a group-based system on
            the outside of the application
         4. The 3-part model better accommodates fine grained
            access in many cases.  If there are 10 Functions that
            can be performed on any of 300,000 account numbers,
            then you would need to create 3,000,000 groups to
            represent all of the combinations.  And if you tried
            to use groups of groups to aggregate sets, it would
            be a harder problem than aggregating Functions and
            Qualifiers separately.
         5. The use of a component called a Function encourages
            business analysts and developers to define Functions
            in understandable business terminology.  It is harder
            to name groups to represent understandable
            business terminology when a group must represent
            both a Function and Qualifier at the same time
         6. A group-based system requires more out-of-band
            information to map groups into privileges to be
            enforced within an application than a system
            modeled on (Person, Function, Qualifier) triplets.
            Thus, there is greater danger that those maintaining
            group membership will not understand the ramifications
            of adding a person to a group.  Similarly, a
            developer who changes the rules for how groups are
            interpreted within an application may cause
            consequences not foreseen by the maintainers of
            the groups.  This problem could happen with a 3-part
            authorization model, but it is less likely to happen.
         7. With the Function component, there is often a simple
            (and obvious) mapping between the Function and a
            subroutine or method within a piece of software. Only
            in simple applications that disallow distributed
            responsibility will there be such a simple mapping
            between a group and a subroutine or method.
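
The arithmetic in point B.4 can be checked with a short sketch. The counts are the ones given above; the `group_names` helper and its "FUNCTION:qualifier" naming scheme are hypothetical, used only to show what a cross product of groups would look like.

```python
from itertools import product

num_functions = 10
num_qualifiers = 300_000

# Group model: one group per (Function, Qualifier) combination.
groups_needed = num_functions * num_qualifiers

# 3-part model: Functions and Qualifiers are defined separately.
objects_needed = num_functions + num_qualifiers


def group_names(functions, qualifiers):
    """Hypothetical naming scheme: one group name per combination."""
    return [f"{f}:{q}" for f, q in product(functions, qualifiers)]


sample = group_names(["ENTER_GRADES", "VIEW_GRADES"], ["BIO101", "BIO102", "CHEM101"])
```

With these numbers the group model needs 3,000,000 groups, while the 3-part model needs 300,010 separately defined objects.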

  2. First, I absolutely agree that the traditional two-part authz model of Subject + Group falls apart in a pluggable LMS/CLE.

    Based on my experience in Sakai, I usually speak of a different three-part model: Subject + Group + Role. My guess is that many scenarios can be translated between the two schemes. (For example, on first reading about MIT's approach, I thought that a "group's membership list" could probably be modeled as "the list of all members who have any functions beginning at a particular qualifier node.") I can see definite advantages to MIT's terminology, however.

    1. I wonder how the Subject + Group + Role model under Sakai could be used to handle the following use case:
         Joe (can) Enter Grades (for) Biology 101

      Clearly, the subject would be Joe.  Would the Group be "Biology 101" and the Role be "Enter Grades" (or some higher-level aggregate of functions that includes "Enter Grades") ?  In other words, would we have the following mapping?
         Roles DB "Subject" -> Sakai "Subject"
         Roles DB "Function" -> Sakai "Role"
         Roles DB "Qualifier" -> Sakai "Group"
      In the Sakai model, can a Subject have more than one Role for the same Group?

      1. You may have figured this out from my following comment, but the mapping would be:

        Joe has the role "Head GSI" in "Biology 101"...
        ... and "Head GSI" has the permission "gradebook.gradeAll".

        Where the MIT model is far better is in directly allowing qualifiers other than group. For example:

        Joe has the role "Editor" in "Online Biology Textbook"

        In a group-based model, we have to create an otherwise unneeded object just to be able to give multiple people special access to a resource.
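
The mapping described in this reply can be sketched as two lookups. This is not Sakai's API; the dictionary names and the `has_permission` function are hypothetical, and the sketch assumes one role per subject per group, which the thread leaves as an open question.

```python
# Subject holds a Role within a Group (assumption: at most one role per pair).
ROLE_IN_GROUP = {("joe", "Biology 101"): "Head GSI"}

# Each Role bundles application-defined permissions.
ROLE_PERMISSIONS = {"Head GSI": {"gradebook.gradeAll"}}


def has_permission(subject, permission, group):
    """Resolve subject -> role in group -> permissions granted by that role."""
    role = ROLE_IN_GROUP.get((subject, group))
    return permission in ROLE_PERMISSIONS.get(role, set())
```

Giving Joe the "Editor" role on "Online Biology Textbook" in this scheme would require registering the textbook as a group first, which is the extra object the reply mentions.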

  3. It is not sufficient to say "Please give Joe access to Biology 101". This statement gives you the Person (Joe) and the Qualifier (Biology 101), but omits the Function. So, you would respond to such a statement by asking "What is the Function that you want Joe to be able to perform for Biology 101? Is it 'Enter grades', 'View grades', 'Schedule final exams', etc.?"

    This is where we get into Sakai's split between installation-defined Role and application-defined Permission. (In real life, well-designed externalized application permissions actually end up looking more like roles, but I'll use our current jargon for now.)

    Let's say that you have an LMS/CLE like Sakai with, oh, close to a hundred different plug-in applications and services provided by institutions over the globe. One of them might be an Online Gradebook with functions like "Enter all grades", "Enter grades for assigned sections", and "Be gradable". Another is a Registrar Integration Plugin with functions like "Report final grades" and "Revise final grades". Another is an Online Assessment Engine with functions like "Schedule exam time" and "View grader comments". Anyway, each application is likely to need to expose multiple functions. And every time we install a new plug-in (and often when we upgrade an existing plug-in), we're likely to get a new vocabulary of functions.

    We can't really ask instructors, researchers, and students (or even long-suffering administrators) to one-by-one assign hundreds of functions to thousands of subjects or qualifiers. Just to reduce cognitive load, we need to bundle function mappings into larger cross-application units. In MIT terms, I suspect this might be accomplished with a Function hierarchy: "Joe" has the function "Student" in "Biology-101-b", and the function "Student" has child functions like "Be gradable" and "View grader comments".
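
A Function hierarchy of the kind suggested here might be sketched as follows. The data structure and the names `CHILD_FUNCTIONS`, `expand`, and `can` are hypothetical and are not taken from the Roles Database; the hierarchy is assumed to be acyclic.

```python
# Hypothetical Function hierarchy: parent function -> child functions.
CHILD_FUNCTIONS = {
    "Student": ["Be gradable", "View grader comments"],
}

# One high-level grant instead of many fine-grained ones.
GRANTS = {("joe", "Student", "Biology-101-b")}


def expand(function):
    """Return the function together with all of its descendants."""
    result = {function}
    for child in CHILD_FUNCTIONS.get(function, []):
        result |= expand(child)
    return result


def can(person, function, qualifier):
    """True if any grant for this person and qualifier implies the function."""
    return any(
        p == person and q == qualifier and function in expand(f)
        for p, f, q in GRANTS
    )
```

An administrator records a single grant of "Student", and each plug-in still asks about its own fine-grained function.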

    One justification for our treating the "Role" function separately is that it's configured separately. Not all institutions or installations will have the same roles, and the list of available roles may change over time. But that may not be any more of a problem than the fact that different Qualifiers are being configured by different authorities (e.g., the qualifier "Biology-101-b" is obtained from a different source than the qualifier "Payroll Department"). I'll have to try the MIT approach out in a few more scenarios to get an idea.

    1. I agree that it is important to define Functions (or Roles) at a high level where possible to avoid requiring administrators, etc., to enter authorization information at an unnecessarily detailed level. Sometimes it is necessary to also provide the option of granting authorizations at a more detailed level as well, for individual cases. It is also important to be able to get data  about people (e.g., Joe is a student in Biology-101-a) from central sources so they can be used for rules-based authorizations without having to re-key the data. (For the record, MIT's Roles Database does have some data about people  drawn from other systems and used for evaluating authorization rules, but we do not currently have data about student enrollment.)
      You mentioned that the Qualifier "Biology-101-b" would be obtained from a different source than "Payroll Department".  This is true, but not a problem, as these are different types of Qualifiers that sit in different hierarchies.  "Biology-101-b" is a leaf in the hierarchy of academic schools, departments, courses, and sections, whereas "Payroll Department" would be represented in the HR-org unit hierarchy.
      An Authorization related to courses and sections would have a Qualifier within the academic schools/departments/courses/sections hierarchy, and an Authorization related to HR or Payroll Functions would have a Qualifier from the HR-org unit hierarchy.
      (Incidentally, MIT has a Master Department Hierarchy (MDH) that has an uber-org-unit-hierarchy, and each of these units is then linked to appropriate  HR, financial, and academic objects.  The MDH accommodates the differences in organization between the different application areas (it doesn't insist that HR and Financials and the Registrar all have identical orgs), but by linking objects to uber-org-units, it accommodates cross-domain reporting in our Data Warehouse, and it supports Authorizations in the Roles Database that span application areas that have different org charts.  This would be a topic for a whole different discussion.)

  4. I like the subject verb object concept, and I will keep that in mind when we work on privilege management enhancements to Grouper.

    If you were modeling the above case in Grouper as a group management system, you could use group "lists" as the verb.  We do that right now with privileges internal to Grouper.  You can admin, read, update, view, etc. a group.  We have separate lists on the group where we add people or groups who are allowed.  The gap from the above case is that it isn't possible to make a group which consists of another group's non-member list, so you can't make a collection of collections.  Not that you would want to do something like that anyway, but I'm just mentioning that Grouper is a 3-part model already, and we will try to maintain that with the upcoming "attribute framework" which can support privileges.

    A question, though: when you define a qualifier in the Roles DB, is that a strict hierarchy?  (You mention a branch of a tree, but I don't recall more information.)  For example, if you define a qualifier of math-261 that might be cross-listed inside several parents, can you group qualifiers under multiple parents or only one?  In Penn's local authorization system, we do have a strict hierarchy of qualifiers, and it has been limiting; e.g., the set of screens a professor role can view might overlap the screens an advisor role can view.

    1. It is not a strict hierarchy or tree that organizes Qualifiers.  A Qualifier can have more than one parent.  We do not allow cyclical connections, i.e., a Qualifier cannot have itself as a parent, grandparent, great-grandparent, etc..
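
A structure with multiple parents but no cycles, as described here, could be guarded as in the sketch below. The function names and the cross-listing of math-261 under two departments are hypothetical illustrations, not the Roles Database implementation.

```python
def would_create_cycle(parents, child, new_parent):
    """True if linking child -> new_parent would make child its own ancestor."""
    stack, seen = [new_parent], set()
    while stack:
        node = stack.pop()
        if node == child:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return False


def add_parent(parents, child, new_parent):
    """Add a parent link, refusing any link that would close a cycle."""
    if would_create_cycle(parents, child, new_parent):
        raise ValueError(f"{new_parent!r} cannot be a parent of {child!r}")
    parents.setdefault(child, set()).add(new_parent)


# A Qualifier may have more than one parent.
parents = {}
add_parent(parents, "math-261", "Mathematics")
add_parent(parents, "math-261", "Computer Science")
```

An attempt such as `add_parent(parents, "Mathematics", "math-261")` raises `ValueError`, because math-261 would then be its own grandparent.
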
      I'd like to ask a follow-up question about how Grouper works as a 3-part model.

      When you were talking about group "lists" as a verb, I think you were saying you can have a Group XYZ and then separate groups  XYZ-admin, XYZ-read, XYZ-update whose members are allowed to perform related transactions for Group XYZ.
      That means that the set of groups you need is the cross-product of all of the Objects and all of the applicable Verbs.  For example, if you've got 1000 academic courses and 10 functions you can perform against a given course, then you need 10,000 groups in order to be able to control who can do exactly what functions against what courses.   Is that what you meant when you said Grouper uses a 3-part model, or am I missing something?

  5. No, a group in Grouper does not actually just have members.  It has members assigned to a list.  (Of course, there is a default list called "members", so if no list is mentioned, the "members" list is meant.)  There are also types of lists: there is an "access" type of list (Grouper builtin privileges which every group automatically has: admin, update, read, view, optin, optout), and there is the "lists" type, which has "members" (default), and any custom type of list an institution wants to invent.  So if you've got 1000 academic courses, and 10 functions, you could have 1000 group objects, and 10 lists.  You assign a member to a list of a group.  3 part: member, list, group.  However, as I said before, there are some limitations.  We have a hierarchy-like structure similar to yours: we allow a group to be in more than one other group, though we do allow cyclical connections.  Make sense?
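
The (member, list, group) assignment described above can be sketched with a plain dictionary. This is not the Grouper API; `assignments`, `assign`, `is_on_list`, and the custom list name `enter_grades` are hypothetical names chosen for illustration.

```python
from collections import defaultdict

# (group, list_name) -> set of members assigned to that list of that group.
assignments = defaultdict(set)


def assign(member, group, list_name="members"):
    """Assign a member to a list of a group; 'members' is the default list."""
    assignments[(group, list_name)].add(member)


def is_on_list(member, group, list_name="members"):
    """True if the member is assigned to the given list of the group."""
    return member in assignments.get((group, list_name), set())


assign("joe", "Biology 101", "enter_grades")  # custom list acting as the verb
assign("ann", "Biology 101")                  # default "members" list
```

In this shape, 1000 courses and 10 functions need 1000 group objects and 10 list names rather than 10,000 separate groups.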

    I know in the call we mentioned that Grouper is not a vanilla flavored group management system, but I would be more comfortable if this discussion was not "Groups versus Privilege Management", but rather: "2-part versus 3-part privilege management".  This is because group management can be 3-part, and privilege management can be 2-part (e.g. Penn's privilege management system just has assignments with no verbs).  Maybe we need to define what 3-part means: in this case it means how many types of objects are involved in an assignment.  Also, I wonder if the magic number is 3 in 3-part.  Maybe it should be N-part?  I'm sure 3-part satisfies the 80/20 rule, though maybe complex cases would benefit from more than 3 parts.

  6. So far the biggest conceptual gap I find when applying the perMIT model to my use cases is "Subject" being restricted to "some specific entity that can be authenticated". I understand the justification, but it seems to block rule-based dynamically-resolved integrations with external systems:

    • "Any student that LDAP attributes show is officially enrolled in Psyc 202" "Takes" "Course Evaluation Poll"
    • "Anyone that a Shibboleth attribute identifies as a biochemist" "Edits" "BiochemistryWiki"

    Such conditions might not be resolvable until after authentication. It's likely that an authz federation service would then cache the more A-spec-like "JaneQResearcher Edits BiochemistryWiki". But it can't take over management of that function-grant if the condition can change between user logins.

  7. Yes, I think Groups of subjects need to be able to be assigned privileges.  Then you set up groups which are kept in sync with your attributes or are driven dynamically.

  8. Ray Davis said:
    So far the biggest conceptual gap I find when applying the perMIT model to my use cases is "Subject" being restricted to "some specific entity that can be authenticated". I understand the justification, but it seems to block rule-based dynamically-resolved integrations with external systems:

      • "Any student that LDAP attributes show is officially enrolled in Psyc 202" "Takes" "Course Evaluation Poll"
      • "Anyone that a Shibboleth attribute identifies as a biochemist" "Edits" "BiochemistryWiki"
        Such conditions might not be resolvable until after authentication. It's likely that an authz federation service would then cache the more A-spec-like "JaneQResearcher Edits BiochemistryWiki". But it can't take over management of that function-grant if the condition can change between user logins.

    If you take a step back and think of perMIT in terms of XACML's models, Jim's current write up covers the data model that can be exposed in PIP, PEP, and PDP. However, perMIT also has what we call implied authorizations, which I think more closely maps to the PAP area.

    The Implied Authorizations portion of perMIT allows one to write a set of rules which will evaluate data from other sources and then populate the perMIT ASPECs.

    This means that you could write a rule that would use the LDAP attributes to create the ASPEC for each student which would allow them to perform the course evaluation for Psyc 202.

    It turns out that at MIT we would not actually use LDAP to do this. Instead we can do things a bit more efficiently. The registrar feeds data into our Data Warehouse about which student is registered for which course, from their point of view. We could then write a rule to evaluate that data. However, the registrar's data is not 100 percent authoritative when it comes to who can perform a course evaluation. An individual instructor or departmental AO may wish to grant some other people the privilege as well.

    For example, faculty members or departments may know that a student's paperwork with the registrar is held up past the semester. The faculty member or department may want a particular student to be able to perform the course evaluation poll even if the registrar has not yet determined that the student was enrolled for the course. A departmental authorizer could then explicitly grant additional people the privilege necessary to perform the course evaluation. Or, if you felt that your LMS had the necessary data, you could write another rule based on the data within the LMS.

    This gives us a flexible mechanism that doesn't end up bogged down in the politics of deciding who can set an attribute in the LDAP directory that indicates that someone is registered for "Psyc 202". We feel it is best to avoid having to "lie" about an attribute in order to have a side effect of granting a needed privilege.

    Note that the audit trails within perMIT would also be able to show how the user was granted the privilege. You could get a report that shows all of the people that have the function "perform course evaluation" with a qualifier or scope of "Psyc 202, Spring 2009". The report would show who had been assigned this privilege via a rule, and who had been assigned this privilege by an explicit grant, and who had done the grant.
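
The combination of rule-derived and explicit grants, with a report showing how each privilege was assigned, might look like the sketch below. It is not perMIT code; the `Grant` fields, the rule name `registrar-enrollment-rule`, the granter `dept-authorizer`, and the sample enrollment data are all hypothetical.

```python
from dataclasses import dataclass

EVALUATE = "perform course evaluation"


@dataclass(frozen=True)
class Grant:
    person: str
    function: str
    qualifier: str
    source: str      # "rule" or "explicit"
    granted_by: str  # rule name, or the person who made the grant


def apply_enrollment_rule(enrollments, course):
    """Implied authorizations: one grant per student enrolled in the course."""
    return [
        Grant(student, EVALUATE, course, "rule", "registrar-enrollment-rule")
        for student, enrolled_course in enrollments
        if enrolled_course == course
    ]


def report(grants, function, qualifier):
    """Who holds the privilege, how it was assigned, and by whom or what."""
    return sorted(
        (g.person, g.source, g.granted_by)
        for g in grants
        if g.function == function and g.qualifier == qualifier
    )


# Hypothetical registrar feed, plus one explicit departmental grant.
enrollments = [("alice", "Psyc 202"), ("bob", "Psyc 202"), ("carol", "Chem 101")]
grants = apply_enrollment_rule(enrollments, "Psyc 202")
grants.append(Grant("dave", EVALUATE, "Psyc 202", "explicit", "dept-authorizer"))
```

Because the explicit grant is recorded as such, nobody has to alter an enrollment attribute just to let one more student fill in the evaluation.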

    Now let's look at the other use case:

    • "Anyone that a Shibboleth attribute identifies as a biochemist" "Edits" "BiochemistryWiki"
      Such conditions might not be resolvable until after authentication. It's likely that an authz federation service would then cache the more A-spec-like "JaneQResearcher Edits BiochemistryWiki". But it can't take over management of that function-grant if the condition can change between user logins.

    It's true that perMIT does not readily accommodate this model. But maybe that's a good thing. I'll assume that since you are using Shibboleth authentication to the wiki, that you're also talking about using this within the context of a federation, with a variety of institutional IdPs involved. That  makes it difficult to get everyone to agree on a controlled vocabulary that could be used to automate the authorization management as cleanly as you envision.

    If you look at OpenWetWare, they are using a number of different wiki servers deployed around the world. It turns out that the OpenWetWare community doesn't just consist of biochemists; it also includes biologists, computational biologists, bioengineers, research specialists, undergraduate students, and others. They don't want just any "biochemist" to have open-ended editing privileges in their wikis.

    They operate much more like a gated community. They want to know more about you before you are let through the gate, and the privilege management is evolving to be somewhat complex. The wikis contain valuable intellectual property and they need to control access at fine granularity with transparent auditing.

    Creating such an environment based on simple assertions from a worldwide community of IdPs is not practical. I also don't believe it is even desirable. Instead some of the OpenWetWare administrators desire a model that is much more like that used by today. A user should first authenticate, and can browse the same pages that someone can access anonymously. But once the user is authenticated, a user profile is created. Subsequently an administrator can grant that profile the necessary privileges.

    If someone were to apply perMIT to the wiki management using this model there would be a number of functions: page read, page edit,  page remove, page export, comment read, comment remove, attachment create, attachment remove, mail remove, space export, space admin, ...

    The qualifiers, or scope, would then be the hierarchy of spaces and individual pages and documents.

    1. Ah, excellent – I hadn't understood the "implied authorization" piece from what I'd read (although it's nicely brought out in your ACAMP Background Material), and it certainly fits the use case.

  9. At Penn we have an automatic ANONYMOUS role, and an AUTHENTICATED role.  We assign permissions to those roles.  (e.g. ANONYMOUS users can see the splash screen, AUTHENTICATED users can do directory searches).  We would not know which people have which permission when in these roles, but we also do not have to expand a grant to AUTHENTICATED to 400k rules for the number of people with netId's, and keep them in sync as people are given netId's or taken away.  Same could go for activePerson, activeEmployee, etc...  I think role based assignments can be useful in privilege management.
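
Role-based assignment of this kind can be sketched by resolving roles at request time instead of storing one rule per person. This is not Penn's system; the function names are hypothetical, and the sketch assumes that an authenticated user also keeps the ANONYMOUS role.

```python
# Permissions are assigned to roles, not to individual people.
ROLE_PERMISSIONS = {
    "ANONYMOUS": {"see splash screen"},
    "AUTHENTICATED": {"directory search"},
}


def roles_for(net_id):
    """Resolve automatic roles at request time (net_id is None if not logged in).

    Assumption: an authenticated user also holds the ANONYMOUS role.
    """
    if net_id is None:
        return {"ANONYMOUS"}
    return {"ANONYMOUS", "AUTHENTICATED"}


def is_permitted(net_id, permission):
    """True if any of the caller's automatic roles carries the permission."""
    return any(permission in ROLE_PERMISSIONS[role] for role in roles_for(net_id))
```

The permission table stays at two entries no matter how many netIds exist, at the cost of not being able to list the individual holders directly.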

    I will also mention that Grouper currently stores effective memberships (e.g. memberships that exist because a group is a member of another group) as a record in a table for each effective membership, which is similar to what you are suggesting (row per assignment to group member).  Generally it is a win-win, always quick reads, mostly quick writes.  However, when there is a course list, and there is one place to add an admin to the courses, and there are thousands of courses, then adding a member to that group suffers from slow performance as those thousands of records are written.  And if you want it to be transactional, then you are waiting for those records when you insert one immediate membership.  So if you assign a privilege to an active employee at Penn, and that generates 20k records in a table, I think there might be a similar performance issue.  Grouper 1.5 is removing the row-per-effective-membership design, and privileges in Grouper will not have a row per member for role based assignments.  I think point in time auditing will be complex, and finding who has a privilege will be complex.  However, I think the performance is very important in reads, somewhat important in writes, and less important for audit reports of role based assignments involving large groups or complex hierarchies.
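
The trade-off described above can be illustrated with a small sketch that contrasts resolving memberships on read with materializing one row per effective membership. It is not Grouper code; the data layout, function names, and the three sample course groups are hypothetical.

```python
def effective_members(group, direct, subgroups, _seen=None):
    """Resolve members on read by walking member groups (tolerates cycles)."""
    seen = set() if _seen is None else _seen
    if group in seen:
        return set()
    seen.add(group)
    members = set(direct.get(group, ()))
    for sub in subgroups.get(group, ()):
        members |= effective_members(sub, direct, subgroups, seen)
    return members


def materialize(direct, subgroups):
    """Row-per-effective-membership design: one (group, member) row each."""
    groups = set(direct) | set(subgroups)
    return {
        (g, m)
        for g in groups
        for m in effective_members(g, direct, subgroups)
    }


# One admin group is a member of every course group (3 here, thousands in practice).
direct = {"course-admins": {"ann"}}
subgroups = {f"course-{i}": {"course-admins"} for i in range(3)}

before = materialize(direct, subgroups)
direct["course-admins"].add("bob")  # a single immediate membership
after = materialize(direct, subgroups)
new_rows = after - before
```

Adding one immediate membership produces one new row for the admin group plus one per course group, which is the write cost that grows with the number of courses.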