This is Penn's experience implementing the Grouper organization hierarchy. Note: we used to put the fully qualified org name in the group name, but now the group name is just the org extension, without the ancestor path.
This is the size of our org implementation:
- 27,000 people in orgs at Penn
- 2,200 orgs (700 leaf nodes, 1500 rollup nodes)
- 3,000 org groups (more due to include/exclude lists)
- 500,000 org memberships (there are a lot due to the rollups, and include/exclude lists)
- Org loader: 1 minute 20 seconds (after the first load), which equates to 630 groups and 24,000 immediate memberships per minute
- Rollup loader: 1 minute 40 seconds (after the first load), which equates to 1,000 groups per minute
- Center loader: 40 seconds
- Consultant loader (currently one group managed): 4 seconds
- We have a view of orgs and their parent IDs, and a view of org-to-person assignments
- There are a few loader jobs:
- One that manages all the leaf node orgs groups (assigns people to org groups)
- One that manages all the non-leaf rollup nodes (assigns orgs to parent org groups)
- One that manages Penn "centers" which are our high level organizations (assigns direct center children orgs to center groups)
- One that manages contractor groups. We don't have contractors assigned in our org structure, so we manage them through attributes in our person database. Then we can manually make other groups next to the org rollup that include both the org rollup and the contractor group
- We will run all these jobs once daily after our payroll jobs run
- Grouper include/exclude automatically creates include and exclude lists, which are tacked on to the system of record group. This is useful in loader jobs because the loader's system of record list is resynced with the DB query on each run, so manual changes to that list would be undone. At this point Penn only needs "include" lists, not exclude lists (the use case is that VPs are in a different org than the org they manage, so we want to add them in). So we will have include/exclude groups on the rollup groups, the center groups, and the contractor groups. We don't think we will need them on the leaf nodes, though if a need comes up later we can add them. The reason not to add them everywhere is performance: it creates a lot more groups and memberships.
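To make the include/exclude behavior above concrete, here is a minimal sketch in Python that models the lists as plain sets. It is not Grouper code: the function name and the subject IDs are made up for illustration. It only shows why a manual include survives a loader run while the system of record list gets replaced.

```python
def effective_members(system_of_record, includes, excludes):
    """Overall membership: (system of record + includes) minus excludes."""
    return (set(system_of_record) | set(includes)) - set(excludes)


# Hypothetical subject IDs, for illustration only.
includes = {"10099"}   # e.g. a VP added by hand to the org they manage
excludes = set()       # Penn does not use exclude lists at this point

# First loader run: the system of record list comes from the DB query.
system_of_record = {"10001", "10002"}
overall_before = effective_members(system_of_record, includes, excludes)

# Next loader run: the system of record list is replaced wholesale by the
# query results, but the include list is a separate group and is untouched.
system_of_record = {"10001", "10003"}
overall_after = effective_members(system_of_record, includes, excludes)

print(sorted(overall_before))
print(sorted(overall_after))
```

The manual addition stays in the overall group across runs, while anyone who drops out of the query results is removed.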
Here are the steps to creating a loader SQL_GROUP_LIST job:
- First of all, prefer views, and put only a simple select from the view in the loader config. That way it is easy to tell what the loader is going to do without hunting through the loader configs, and easy to make changes at runtime (though I believe the loader allows runtime query changes as well)
- Note: if you are using include/exclude, then the group names should have the system of record suffix, which is configured in grouper.properties
- Make a view of groups where each row represents a group.
- There is a column for the group name, display name (optional), description (optional), and security (e.g. readers, viewers, etc.; optional)
- This view will set these attributes of group and auto-create groups which have no members (might be useful for orgs since apps can refer to groups which have no members)
- Note: if you can give groups a unique suffix in the stem structure, then the job can use the setting "grouperLoaderGroupsLike" which will delete groups which are no longer in the group list
- Make the query which returns the subjectId (and sourceId if not the default loader source), and group name
- For leaf nodes, this is generally a simple query that assigns people to org groups
- For rollup nodes, this can be a little complex.
- First of all, you might union the direct rollup children with the direct rollup leaf nodes
- The subjectId for groups is the group_id. For Grouper 1.4, you can join to the grouper_attributes table. For 1.5 you can simply join to the grouper_groups table. In both, you can join to grouper_groups_v if you like. What I did for 1.4 is: grouper_attributes ga, grouper_fields gf where gf.NAME = 'name' AND gf.ID = ga.field_id and ga.VALUE = ocrv.MEMBER_GROUP_NAME. I would definitely keep this query in a view.
- You should also specify the source id for groups: 'g:gsa' as SUBJECT_SOURCE_ID
- Configure the config group for each loader job. I put this next to the top level loaded stem. I generally do this in GSH, though you could also do this in the UI
- Kick off the loader job manually in GSH so you can verify the results without waiting for the cron to run
- Restart your loader so it picks up the new job
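The rollup membership query described in the steps above can be sketched as follows. This is only an illustration run against an in-memory SQLite database from Python, not our actual views: the source tables (rollup_child_rollups, rollup_child_leaves), the view name, and the group names are hypothetical, and grouper_groups here is a two-column stand-in for the 1.5-style registry table. The system of record suffix is left off the names for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stand-in for the Grouper group table (1.5 style: join on name to get the id).
CREATE TABLE grouper_groups (id TEXT PRIMARY KEY, name TEXT);

-- Hypothetical source tables: which groups sit directly under each rollup.
CREATE TABLE rollup_child_rollups (parent_group_name TEXT, member_group_name TEXT);
CREATE TABLE rollup_child_leaves  (parent_group_name TEXT, member_group_name TEXT);

INSERT INTO grouper_groups VALUES
  ('g1', 'penn:org:rollup:UNIV'),
  ('g2', 'penn:org:rollup:ISC'),
  ('g3', 'penn:org:person:9101_personorg');

INSERT INTO rollup_child_rollups VALUES
  ('penn:org:rollup:UNIV', 'penn:org:rollup:ISC');
INSERT INTO rollup_child_leaves VALUES
  ('penn:org:rollup:ISC', 'penn:org:person:9101_personorg'),
  ('penn:org:rollup:ISC', 'penn:org:person:9999_personorg');  -- group not created yet

-- Membership view: union the child rollups with the child leaf nodes, then
-- resolve each member group name to its group id, which is the subject id.
CREATE VIEW org_rollup_members_v AS
SELECT c.parent_group_name AS group_name,
       gg.id               AS subject_id,
       'g:gsa'             AS subject_source_id
FROM (SELECT parent_group_name, member_group_name FROM rollup_child_rollups
      UNION
      SELECT parent_group_name, member_group_name FROM rollup_child_leaves) c
JOIN grouper_groups gg ON gg.name = c.member_group_name;
""")

# The loader config would then only need a simple select from the view.
rows = conn.execute(
    "SELECT group_name, subject_id, subject_source_id "
    "FROM org_rollup_members_v ORDER BY group_name, subject_id"
).fetchall()
for row in rows:
    print(row)
```

Because the join is an inner join, a child group that does not exist in the registry yet simply produces no membership row in this sketch, so the job that creates the leaf groups would need to have run before the rollup job.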
Person orgs (leaf nodes)
- Let's make a function which strips out special chars:
* Implement the view of orgs. Note, in the view you can easily use unions in the sql to end the hierarchy at a different node. At first we were going to do this, then we decided against it. However, you will see that we do filter out certain branches and nodes. Also note we shorten the names of some top level nodes.
- This shows the following data (2200 rows)
- Here is the view which assigns people to orgs. This view makes sure the person has at least one active job in that org (it doesn't have to be the primary job)
- This will give the following data (cleansed). The penn_id is the subject_id of the person
- Make a view about the person org metadata
* This data looks like this
* Penn has two types of orgs: orgs that hold people, and orgs that don't (they hold other orgs). So I will create them separately. Here is a view of the orgs that hold people, which the loader will use. Note: these will not be include/exclude, since we only need that on the higher level groups (rollups). Note: each group here should end in _personorg so we can tell which groups are managed by this loader process.
- Add the config group. Note, there are no org members yet (1=0), so I can inspect the grouperorgs_hierarchical table
* Inspect the grouperorgs_hierarchy table. Note: there were some problems, so we added the function to strip bad chars and trim the data, and also adjusted the org_loader_person_v (so the names and everything are OK). Here is what the org_loader_person_v looks like (person data scrubbed)
* Add the group query, and fix the member query (take out 1=0)
- This created 736 org groups with 33k members
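The leaf node steps above can be sketched end to end as follows. This is a rough Python stand-in, not the function or views we actually used: the allowed character set, the table names (org_groups, org_assignments), the penn:org:person stem, and the sample rows are all assumptions made for illustration. It shows three things: scrubbing and trimming names, giving every group the _personorg suffix, and using 1=0 so the member query returns nothing while the groups are inspected.

```python
import re
import sqlite3


def strip_special_chars(text):
    """Drop anything outside an assumed safe character set, then trim."""
    cleaned = re.sub(r"[^A-Za-z0-9 _.-]", "", text)
    return re.sub(r"\s+", " ", cleaned).strip()


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE org_groups (group_name TEXT, group_display_name TEXT);
CREATE TABLE org_assignments (penn_id TEXT, org_code TEXT);
""")

# Hypothetical raw org data with stray whitespace and punctuation.
raw_orgs = [("9101", "  Info Systems & Computing\t"), ("9102", "Dean's Office (SAS)")]
for code, name in raw_orgs:
    conn.execute(
        "INSERT INTO org_groups VALUES (?, ?)",
        (f"penn:org:person:{code}_personorg", strip_special_chars(name)),
    )
conn.executemany(
    "INSERT INTO org_assignments VALUES (?, ?)",
    [("10001", "9101"), ("10002", "9102")],
)

# Member query: penn_id is the subject id; the group name carries the suffix.
MEMBER_QUERY = """
SELECT penn_id AS subject_id,
       'penn:org:person:' || org_code || '_personorg' AS group_name
FROM org_assignments
"""

# First pass: 1=0 keeps the query valid but returns no members.
dry_run_rows = conn.execute(MEMBER_QUERY + " WHERE 1=0").fetchall()

# Second pass: take out 1=0 to load the real memberships.
real_rows = conn.execute(MEMBER_QUERY + " ORDER BY subject_id").fetchall()

display_names = [
    r[0] for r in conn.execute(
        "SELECT group_display_name FROM org_groups ORDER BY group_name"
    )
]
print(dry_run_rows, real_rows, display_names)
```

Starting with 1=0 makes it cheap to check the group names and display names before any memberships are created.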