We wanted a fast way to import a large number of groups and memberships. GSH is very slow. Using the plain Grouper API with some logic for incremental changes is better, but still not good enough. So we developed a simple application which makes changes in the Grouper database with plain JDBC and uses batch updates.
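The core idea can be sketched as follows. This is a minimal illustration, not our actual importer: the table and column names below are simplified placeholders, while Grouper's real membership schema has more columns and constraints that the importer has to fill in as well.

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.util.List;

    // Minimal sketch of the batch-update idea. "memberships", "group_id" and
    // "member_id" are simplified placeholders, not Grouper's real schema.
    public class BatchInsertSketch {

        public static void insertMemberships(Connection conn, String groupId,
                                             List<String> memberIds) throws Exception {
            conn.setAutoCommit(false); // commit once per batch, not per row
            String sql = "INSERT INTO memberships (group_id, member_id) VALUES (?, ?)";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                int inBatch = 0;
                for (String memberId : memberIds) {
                    ps.setString(1, groupId);
                    ps.setString(2, memberId);
                    ps.addBatch();
                    if (++inBatch % 1000 == 0) {
                        ps.executeBatch(); // flush every 1000 rows
                    }
                }
                ps.executeBatch(); // flush the remainder
            }
            conn.commit();
        }
    }

Batching like this keeps the number of database round trips low, which is where most of the speedup over per-row operations comes from.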

Simple benchmark

We ran a simple benchmark comparing our custom provisioning with provisioning based on the Grouper API.
It was tested on an Intel Xeon 5160 (4 cores, 3 GHz) with 8 GB RAM and two SAS disks in RAID.
OS: customized Debian based on Lenny
RDBMS: PostgreSQL 8.3

The provisioning application ran on the same server. The test data contained 200 groups with 183,000 memberships; group sizes ranged from 1 to 20,000.
Our solution imported this data in 759 s.

How it works

Our importer reads an input file (currently the output generated by Grouper funnel, but we plan to develop a more verbose XML format with more options). The name of the input file corresponds to the name of the stem into which the groups and their members will be imported. If the stem doesn't exist, the application creates it.
At startup it fetches all subjects from the registered sources, so it doesn't need to query the source for every imported subject individually.
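A minimal sketch of that pre-fetch, assuming the subject source is a plain JDBC table named "subjects" with an "id" column; a real deployment may use LDAP or any other registered Grouper subject source instead.

    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.Statement;
    import java.util.HashSet;
    import java.util.Set;

    // Sketch of the subject pre-fetch. The "subjects" table and "id" column
    // are assumptions for illustration only.
    public class SubjectCache {

        public static Set<String> loadAll(Connection conn) throws Exception {
            Set<String> known = new HashSet<>();
            try (Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery("SELECT id FROM subjects")) {
                while (rs.next()) {
                    known.add(rs.getString("id"));
                }
            }
            return known; // one query up front instead of one lookup per member
        }
    }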
It is an incremental importer: it fetches the current state of the groups and their members in the stem and compares it with the data from the input file. This "diff" runs in several threads (see the sketch below). After the diff is computed, the resulting changes are applied to the Grouper database with the batch updates described above.
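A minimal sketch of the per-group diff on a thread pool, assuming the two membership maps (current state read from the database, desired state read from the input file) are already loaded; the real importer also has to handle groups that exist on only one side.

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    // Sketch of the per-group diff, run on a thread pool. Maps go from group
    // name to the set of member ids.
    public class MembershipDiff {

        static final class Changes {
            final String group;
            final Set<String> toAdd;    // in the input file but not in the DB
            final Set<String> toRemove; // in the DB but not in the input file
            Changes(String group, Set<String> toAdd, Set<String> toRemove) {
                this.group = group;
                this.toAdd = toAdd;
                this.toRemove = toRemove;
            }
        }

        public static List<Changes> diff(Map<String, Set<String>> current,
                                         Map<String, Set<String>> desired,
                                         int threads) throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(threads);
            List<Future<Changes>> futures = new ArrayList<>();
            for (Map.Entry<String, Set<String>> e : desired.entrySet()) {
                futures.add(pool.submit(() -> {
                    Set<String> now = current.getOrDefault(e.getKey(),
                                                           Collections.emptySet());
                    Set<String> toAdd = new HashSet<>(e.getValue());
                    toAdd.removeAll(now);
                    Set<String> toRemove = new HashSet<>(now);
                    toRemove.removeAll(e.getValue());
                    return new Changes(e.getKey(), toAdd, toRemove);
                }));
            }
            List<Changes> result = new ArrayList<>();
            for (Future<Changes> f : futures) {
                result.add(f.get());
            }
            pool.shutdown();
            return result;
        }
    }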
