We wanted a fast way to import a large number of groups and memberships. GSH is very slow. Using the plain Grouper API with some logic for incremental changes is better, but still not good enough. So we developed a simple application which makes changes in the Grouper database with plain JDBC and uses batch updates.
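The core idea can be sketched as follows. This is a minimal illustration, not our actual importer: the table and column names below are simplified placeholders, while Grouper's real membership schema has more columns and constraints that the importer has to fill in as well.

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.util.List;

    // Minimal sketch of the batch-update idea. "memberships", "group_id" and
    // "member_id" are simplified placeholders, not Grouper's real schema.
    public class BatchInsertSketch {

        public static void insertMemberships(Connection conn, String groupId,
                                             List<String> memberIds) throws Exception {
            conn.setAutoCommit(false); // commit once per batch, not per row
            String sql = "INSERT INTO memberships (group_id, member_id) VALUES (?, ?)";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                int inBatch = 0;
                for (String memberId : memberIds) {
                    ps.setString(1, groupId);
                    ps.setString(2, memberId);
                    ps.addBatch();
                    if (++inBatch % 1000 == 0) {
                        ps.executeBatch(); // flush every 1000 rows
                    }
                }
                ps.executeBatch(); // flush the remainder
            }
            conn.commit();
        }
    }

Batching like this keeps the number of database round trips low, which is where most of the speedup over per-row operations comes from.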

Simple benchmark

We ran a simple benchmark comparing our custom provisioning with provisioning based on the Grouper API.
It was tested on an Intel Xeon 5160 (4 cores, 3 GHz) with 8 GB RAM and two SAS disks in RAID.
OS: customized Debian based on Lenny
RDBMS: PostgreSQL 8.3

The provisioning application ran on the same server. The test data contained 200 groups with 183,000 memberships; group sizes ranged from 1 to 20,000.
Our solution imported this data in 759 s.

How it works

Our importer reads an input file (currently the output generated by Grouper funnel, but we plan to develop a more verbose XML format with more options). The name of the input file corresponds to the name of the stem into which the groups and their members will be imported. If the stem doesn't exist, the application creates it.
At startup it fetches all subjects from the registered sources, so it doesn't need to query the source for every imported subject individually.
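A minimal sketch of that pre-fetch, assuming the subject source is a plain JDBC table named "subjects" with an "id" column; a real deployment may use LDAP or any other registered Grouper subject source instead.

    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.Statement;
    import java.util.HashSet;
    import java.util.Set;

    // Sketch of the subject pre-fetch. The "subjects" table and "id" column
    // are assumptions for illustration only.
    public class SubjectCache {

        public static Set<String> loadAll(Connection conn) throws Exception {
            Set<String> known = new HashSet<>();
            try (Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery("SELECT id FROM subjects")) {
                while (rs.next()) {
                    known.add(rs.getString("id"));
                }
            }
            return known; // one query up front instead of one lookup per member
        }
    }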
It is an incremental importer: it fetches the current state of the groups and their members in the stem and compares it with the data from the input file. This "diff" runs in several threads (see the sketch below). After the diff is computed, the resulting changes are applied to the Grouper database with the batch updates described above.
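A minimal sketch of the per-group diff on a thread pool, assuming the two membership maps (current state read from the database, desired state read from the input file) are already loaded; the real importer also has to handle groups that exist on only one side.

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    // Sketch of the per-group diff, run on a thread pool. Maps go from group
    // name to the set of member ids.
    public class MembershipDiff {

        static final class Changes {
            final String group;
            final Set<String> toAdd;    // in the input file but not in the DB
            final Set<String> toRemove; // in the DB but not in the input file
            Changes(String group, Set<String> toAdd, Set<String> toRemove) {
                this.group = group;
                this.toAdd = toAdd;
                this.toRemove = toRemove;
            }
        }

        public static List<Changes> diff(Map<String, Set<String>> current,
                                         Map<String, Set<String>> desired,
                                         int threads) throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(threads);
            List<Future<Changes>> futures = new ArrayList<>();
            for (Map.Entry<String, Set<String>> e : desired.entrySet()) {
                futures.add(pool.submit(() -> {
                    Set<String> now = current.getOrDefault(e.getKey(),
                                                           Collections.emptySet());
                    Set<String> toAdd = new HashSet<>(e.getValue());
                    toAdd.removeAll(now);
                    Set<String> toRemove = new HashSet<>(now);
                    toRemove.removeAll(e.getValue());
                    return new Changes(e.getKey(), toAdd, toRemove);
                }));
            }
            List<Changes> result = new ArrayList<>();
            for (Future<Changes> f : futures) {
                result.add(f.get());
            }
            pool.shutdown();
            return result;
        }
    }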
