Building the Grouper API
The Grouper API is provided as both a binary and a source distribution. Note that the Grouper Installer installs the binary Grouper API.
To build the source distribution :
Testing the API is performed using GrouperShell :
Testing will destroy any pre-existing data in the Groups Registry database
Configuring the Grouper API
This section describes all of the Grouper API configuration files and important settings.
The Grouper API is distributed with example configuration files that have ".example" inserted in the middle of their names, e.g. grouper.example.properties for grouper.properties. Rename or copy these files to remove the ".example" substring. Building with ant does this for you, and the binary distribution already contains the copied files.
grouper.hibernate.properties: integrating the Grouper API with the database that will house your Groups Registry
sources.xml: integrating the Grouper API with chosen identity sources
grouper.properties: defaults for Grouper privileges, enabling identified external users to act with elevated root-like privilege, changing the display name for internal subjects
grouper-loader.properties: auto-load memberships from external SQL sources, register notification consumers, validate Grouper Rules, update enabled/disabled flags, etc.
Database-Related Settings and Procedures
Database Driver Location
Place the jar file containing the JDBC driver for your database in the lib/custom/ directory. The Grouper v1.5.0 package includes the JDBC driver for HSQLDB. Sample JDBC drivers are located in lib/jdbcSample (e.g. for Oracle, MySQL, and PostgreSQL).
General Property Settings
Grouper uses Hibernate to persist objects in the Groups Registry. Database-specific settings are configured in conf/grouper.hibernate.properties, which has pre-populated examples for HSQLDB, MySQL, Oracle, and PostgreSQL.
Required properties are:
JDBC driver classname
JDBC URL for the database
database user's password. Note that you can also give the name of a file containing the encrypted password
classname of a Hibernate dialect, for setting platform-specific features. Choices are listed in the Hibernate Reference Documentation (Chapter 3. Configuration, 3.4.1. SQL Dialects)
You may need to refer to your database support person to determine these required properties.
Detailed Hibernate configuration documentation is available in the Hibernate Reference Documentation (Chapter 3. Configuration, 3.3. JDBC connections).
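The required properties above can be checked before starting Grouper. The following is a minimal sketch, not part of Grouper: it assumes the standard Hibernate property names (hibernate.connection.driver_class, hibernate.connection.url, hibernate.connection.username, hibernate.connection.password, hibernate.dialect), handles only simple key=value lines, and uses made-up PostgreSQL values. The username key is included as an assumption, since a connection normally needs one.

```python
# Required keys, using standard Hibernate property names.
REQUIRED_KEYS = [
    "hibernate.connection.driver_class",  # JDBC driver classname
    "hibernate.connection.url",           # JDBC URL for the database
    "hibernate.connection.username",      # database user (assumed required)
    "hibernate.connection.password",      # password, or encrypted-password file
    "hibernate.dialect",                  # Hibernate dialect classname
]


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def missing_required(props):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not props.get(key)]


# Hypothetical PostgreSQL settings, for illustration only.
SAMPLE = """
hibernate.dialect = org.hibernate.dialect.PostgreSQLDialect
hibernate.connection.driver_class = org.postgresql.Driver
hibernate.connection.url = jdbc:postgresql://localhost:5432/grouper
hibernate.connection.username = grouper
"""

print(missing_required(parse_properties(SAMPLE)))
```

Here the sample omits the password, so that key is the only one reported as missing.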
MySQL Transaction Support
Transactions (a unit of work in Grouper either completes entirely or not at all) are strongly recommended, though not required. For them to work, your MySQL tables must use a transactional storage engine such as InnoDB. MyISAM, which was the default engine before MySQL 5.5, is not transactional. One way to enable InnoDB in MySQL is with this line in my.cnf: default-storage-engine=innodb
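To illustrate what "all completes or none" means in practice, here is a small sketch that uses an in-memory SQLite database as a stand-in for a transactional engine. It is not MySQL and not Grouper code, and the table and function names are invented for the example.

```python
import sqlite3

# In-memory SQLite database as a stand-in; table and column names are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE membership (member TEXT PRIMARY KEY)")


def add_members(connection, members):
    """Insert all members in one transaction; roll back everything on error."""
    try:
        with connection:  # commits on success, rolls back on exception
            for member in members:
                connection.execute(
                    "INSERT INTO membership (member) VALUES (?)", (member,)
                )
        return True
    except sqlite3.IntegrityError:
        return False


def count_members(connection):
    """Return the number of rows currently in the membership table."""
    return connection.execute("SELECT COUNT(*) FROM membership").fetchone()[0]


ok = add_members(conn, ["alice", "bob"])
failed = add_members(conn, ["carol", "alice"])  # duplicate key aborts the unit
print(ok, failed, count_members(conn))
```

The second call fails on the duplicate "alice", and the already inserted "carol" is rolled back with it, so only the two rows from the first call remain. On a non-transactional engine such as MyISAM, the partial insert would stay in the table.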
MySQL/MariaDB Character Set and Performance Settings
For MySQL and MariaDB, the following settings are recommended.
utf8mb4 is not recommended because the database engines in MySQL/MariaDB that support transactions limit the length of index prefixes. For InnoDB and XtraDB, the prefix is limited to 767 bytes. With utf8mb4, each character can take up to 4 bytes instead of up to 3 (as in utf8), so the prefixes used for some Grouper tables are too long.
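The arithmetic behind this limit can be sketched as follows. The 767-byte limit comes from the text above; the figure of at most 3 bytes per character for MySQL's utf8 (utf8mb3) is standard MySQL behavior, and the 255-character column length is only an illustrative value.

```python
INDEX_PREFIX_LIMIT_BYTES = 767  # InnoDB/XtraDB limit quoted above

# Maximum bytes per character for each character set.
MAX_BYTES_PER_CHAR = {"utf8": 3, "utf8mb4": 4}


def max_indexable_chars(charset):
    """Longest column, in characters, that fits in one index prefix."""
    return INDEX_PREFIX_LIMIT_BYTES // MAX_BYTES_PER_CHAR[charset]


def fits_in_index(column_length, charset):
    """True if a fully indexed column of this length stays within the limit."""
    return column_length <= max_indexable_chars(charset)


for charset in MAX_BYTES_PER_CHAR:
    print(charset, max_indexable_chars(charset), fits_in_index(255, charset))
```

A 255-character column fits under utf8 (255 x 3 = 765 bytes) but not under utf8mb4, where only 191 characters fit.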
You can also consider setting innodb_flush_log_at_trx_commit = 2. The default setting of 1 is required for full ACID compliance and logs are written and flushed to disk at each transaction commit. However, this can be expensive in terms of disk I/O. With a setting of 2, logs are written after each transaction commit and flushed to disk once per second. Transactions for which logs have not been flushed can be lost in a crash. The tradeoff here may be acceptable in your environment.
For those running MariaDB you should read this knowledge base article about OPTIMIZE and defragmenting. Some have found MariaDB 10.3+ to be a good, fast variant of MySQL for use by Grouper.
Allowing and Denying Database Changes
Some database operations (such as dropping tables or recreating data during tests) require confirmation of a prompt asking whether or not to continue. It is possible to automatically allow or deny these database operations in conf/grouper.properties :
Database Initialization Procedure
Database initialization is performed using the GrouperShell.
Initializing the database will destroy any pre-existing data.
To initialize the Groups Registry and install tables, populate default group types and fields, and create the root naming stem :
To re-initialize the Groups Registry (e.g. after running junit tests) :
To see all options :
Analyzing Tables to Improve Query Performance
Whenever a lot of changes are made to the data in the Groups Registry database (including upgrades of Grouper), you should analyze your database tables to improve query performance.
MySQL Syntax: ANALYZE TABLE table_name
PostgreSQL Syntax: ANALYZE table_name
Oracle Syntax: exec dbms_stats.gather_table_stats('schema', 'table_name', cascade => TRUE);
Or this: EXEC DBMS_STATS.gather_schema_stats('schema');
In all cases, substitute "table_name" with each table that you want to have analyzed. For Oracle, also substitute "schema" with the database schema for your Groups Registry.
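When many tables need analyzing, the statements can be generated rather than typed. This sketch only builds the SQL strings from the syntax shown above; it does not connect to a database. The function name, the schema name, and the choice of tables are illustrative assumptions.

```python
def analyze_statements(dialect, tables, schema=None):
    """Build one analyze statement per table for the given database type."""
    if dialect == "mysql":
        return ["ANALYZE TABLE {};".format(t) for t in tables]
    if dialect == "postgresql":
        return ["ANALYZE {};".format(t) for t in tables]
    if dialect == "oracle":
        if not schema:
            raise ValueError("Oracle requires the schema name")
        template = "exec dbms_stats.gather_table_stats('{}', '{}', cascade => TRUE);"
        return [template.format(schema, t) for t in tables]
    raise ValueError("unsupported dialect: " + dialect)


# Illustrative table list; substitute the tables in your own registry.
TABLES = ["grouper_groups", "grouper_memberships"]

for statement in analyze_statements("oracle", TABLES, schema="GROUPER"):
    print(statement)
```

The output can be pasted into the database's command-line client or saved as a script to run after bulk changes.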
Improving queries using histogram statistics
Even a full set of statistics on tables, columns, and indexes sometimes does not give the optimizer enough information. For example, in a database with 100,000 groups and 100,000 users, the optimizer may assume that each member belongs to at most one group, and so build a query plan that does a Nested Loop iteration over the few rows it expects. But suppose the GrouperAll subject is granted read access to a large number of these groups. Queries that check whether a logged-in non-wheel user can read a group would then loop through thousands of rows instead of a few.
With database histograms, values are put into a fixed number of bins. If the column data is heavily skewed toward one value, that value will occupy one or more bins by itself, and the query analysis can use that information to get a rough estimate on the cardinality of a filter on that column.
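The binning idea can be shown with a toy equal-height histogram. This is a simplified model for illustration, not how any particular database implements histograms, and the data set (60 rows for GrouperAll, one row each for 40 users) is invented.

```python
def equi_height_endpoints(values, bins):
    """Sort values and return the last value in each equally sized bin."""
    ordered = sorted(values)
    size = len(ordered) // bins  # assumes len(values) is a multiple of bins
    return [ordered[(i + 1) * size - 1] for i in range(bins)]


def estimate_rows(endpoints, total_rows, value):
    """Rough estimate: rows per bin times the number of bins ending in value.

    A value that ends no bin yields 0 here, meaning "less than one bin".
    """
    rows_per_bin = total_rows // len(endpoints)
    return rows_per_bin * endpoints.count(value)


# Skewed sample: one subject holds 60 of the 100 membership rows.
rows = ["GrouperAll"] * 60 + ["user{:02d}".format(i) for i in range(40)]
endpoints = equi_height_endpoints(rows, bins=10)

print(endpoints.count("GrouperAll"), estimate_rows(endpoints, len(rows), "GrouperAll"))
```

Six of the ten bins end in GrouperAll, so the estimate for that value is about 60 rows, while any single user occupies less than one bin. That difference is what lets a planner avoid assuming a handful of rows for the skewed value.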
With Oracle, a first step toward improving these queries is to add a histogram for a single column, e.g.:
Histograms on more than one column require an extended version of this.
MySQL 8 and later also support histogram statistics, probably similar to Oracle's; see https://mysqlserverteam.com/histogram-statistics-in-mysql/
Configuration of Source Adapters
Grouper uses Subject API compliant "source adapters" to integrate with external identity stores. "Subjects" are the objects housed there that are presented to Grouper for management vis-à-vis group membership and Grouper privileges. These may represent people, other groups, computers, applications, services, most anything for which you manage identity. With the exception of Grouper groups, Grouper treats all subjects opaquely. See the Subject API documentation for further background and details concerning subjects, source adapters, and other aspects of the Subject API.
Each source adapter connects with a single back-end store using JDBC or JNDI. Grouper makes no specific assumptions about the schema of any subject types. Instead, sections of the configuration file, grouper/conf/sources.xml, declare the details of how to connect with each back-end store, the identifier(s) to be used for the subjects it contains, how to select and search for subjects, and which subject attributes should be made available to Grouper.
Three types of source adapters are included in the Grouper API v1.5.0 package. JDBCSourceAdapter and JNDISourceAdapter classes are included in subject.jar, and GrouperSourceAdapter is built along with the Grouper API. Every Grouper API deployment MUST include a *source* element in grouper/conf/sources.xml for the GrouperSourceAdapter so that Grouper can refer to its own groups in the same manner as other subjects.
JDBC and JNDI sources have two options each. For JDBC, if you can make a table or view where each subject is represented as one row, then you can use the more powerful GrouperJdbcSourceAdapter2. One of its major advantages is that if you enter a phrase in the subject search, e.g. "John Smith", it will search for records which contain both John and Smith (case insensitive), whereas the GrouperJdbcSourceAdapter will look for the whole string "John Smith" and will not return a record for "John L Smith". For JNDI, UW contributed a source adapter which should give better performance.
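The difference between the two search behaviors can be sketched in a few lines. These functions model only the matching rule described above; they are not the adapters' actual implementation, and the function names are made up.

```python
def whole_phrase_match(phrase, record):
    """Match only if the entire phrase appears as one substring."""
    return phrase.lower() in record.lower()


def all_terms_match(phrase, record):
    """Match if every whitespace-separated term appears somewhere."""
    haystack = record.lower()
    return all(term in haystack for term in phrase.lower().split())


record = "John L Smith"
print(whole_phrase_match("John Smith", record), all_terms_match("John Smith", record))
```

The whole-phrase rule misses "John L Smith" because of the middle initial, while the per-term rule finds it.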
As of Grouper 2.0, Grouper stores additional data about subjects that are used by Grouper to search and sort a list of members. Each source must be configured for at least one search attribute and one sort attribute.
See sources.example.xml for example usages of sources.xml.
Note that in 1.5.0 the subject API changed, so if you have custom subject sources you will need to tweak and recompile them.
Choosing Identifiers for Subjects
Identifiers and their management can get complicated. They can be revoked or not, re-assigned or not, lucent or opaque, etc. Depending on such characteristics, a given identifier might be a good or bad choice to use in the context of managing the identified subject's group memberships.
For example, a username is often lucent: easily remembered by the person with whom it is associated. But it may also be revocable, meaning that it no longer refers to that person (perhaps they have a new one), or even re-assignable, meaning that it might refer to some other person at a later time. If a username is used to record membership, username changes must trigger corresponding membership changes. A username is better suited to authentication than it is to indicating membership.
On the other hand, an opaque registryID (machine, not human, readable) that never changes is great for membership, but lousy for authentication - it might not even be known by the person to whom it is associated. How would I identify myself to Grouper if I wished to opt-in to a list or manage a group?
Grouper accommodates subject identifier issues in two ways. First, it maintains UUIDs for every subject and group within the Groups Registry. These are never exposed by the API, but are associated with externally supplied subject identifiers within the Groups Registry. This approach allows the identifier associated with a given subject to be changed without any need to change actual memberships.
Second, by relying on the Subject API, Grouper is able to look up subjects that are presented with an identifier in one namespace and obtain identifiers in other namespaces for that subject. That means that it can translate a username into a registryID, for example. So, when a user authenticates to an application using the Grouper API, that application can use the Subject API to fetch the identifier that the site has chosen for use in memberships. Similarly, when a membership in the Groups Registry is to be expressed elsewhere, a provisioning connector can use the Subject API to translate the identifier used for group members into one that is suitable in the provisioned context.
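The following toy model shows why recording memberships against a stable ID, and translating login identifiers at lookup time, survives a username change. It is a sketch under invented names (Subject, SubjectDirectory, find, is_member) and does not use the real Subject API.

```python
from dataclasses import dataclass, field


@dataclass
class Subject:
    subject_id: str                                # opaque, never changes
    identifiers: set = field(default_factory=set)  # e.g. usernames


class SubjectDirectory:
    """Toy stand-in for a subject source; not the Subject API."""

    def __init__(self, subjects):
        self._subjects = list(subjects)

    def find(self, id_or_identifier):
        """Return the subject with this ID or identifier, or None."""
        for subject in self._subjects:
            if (id_or_identifier == subject.subject_id
                    or id_or_identifier in subject.identifiers):
                return subject
        return None


directory = SubjectDirectory([Subject("10021368", {"jsmith"})])
members = {"10021368"}  # memberships are recorded by subject ID


def is_member(login):
    """Translate a login identifier to a subject ID, then check membership."""
    subject = directory.find(login)
    return subject is not None and subject.subject_id in members


before = is_member("jsmith")
# The username changes; the subject ID and the membership do not.
directory.find("jsmith").identifiers = {"john.smith"}
after = (is_member("john.smith"), is_member("jsmith"))
print(before, after)
```

After the rename, the new username resolves to the same subject ID and the membership is still found, while the old username no longer resolves. No membership record had to change.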
Subject ID: should be unchangeable and irrevocable. Usually this is an opaque ID (a number, a UUID, etc.). The source that a subject is associated with also should not change.
Subject Identifier: anything that can refer to a subject uniquely. Usually these are netIds, eppns, etc.
Ideally, subject IDs and identifiers are unique across sources, though this is not required.
You should not have the same subject in more than one source.
Subjects should be resolvable for as long as you want users to be able to search for them or view them on the UI. It is possible for subjects to not be active in which case they are not searchable, but still be resolvable so they can be shown in the UI in auditing.
All configuration of Grouper properties detailed in this section occurs in the grouper/conf/grouper.properties file. Look in the grouper.example.properties file for the more obscure settings. Common settings are listed below.
Note that in Grouper 2.2 and above, overlay Configuration files can be used (and are recommended).
This setting describes the environment that Grouper is running in, e.g. it is used in the daily report from the loader
Whether Grouper should automatically initialize the registry if it is not yet initialized (i.e. insert the root stem, built-in fields, etc.)
Whether Grouper should try to detect and log configuration errors on startup. In general this should be true, unless the output is too noisy or it is causing a problem
If groups like the wheel group should be auto-created for convenience (note: check config needs to be on)
Auto-create groups (increment the integer index), and auto-populate with users (comma separated subject ids) to bootstrap the registry on startup (note: check config needs to be on). The next group would end in 1, then 2, etc
By default, anyone with admin rights on a group can edit the types or attributes. Specify types (and related attributes) which are wheel only, or restricted to a certain group
If you don't want to be prompted for DDL changes in certain databases (e.g. dev), list them here:
Allow and deny for db data or object deletes, without prompting the user to confirm
If a listing is in the allow, it will be allowed to delete db
If a listing is in the deny, it will be denied from deleting db
Multiple inputs can be entered with .0, .1, .2, etc. These numbers must be sequential, starting with 0
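The sequential-numbering rule matters if entries are read by counting up from 0 until a key is missing. The sketch below assumes that reading strategy, which is one plausible interpretation of the rule and not confirmed Grouper behavior. The property prefix db.change.allow.url and the JDBC URLs are assumptions used for illustration.

```python
def collect_indexed(props, prefix):
    """Collect prefix.0, prefix.1, ... and stop at the first missing index."""
    values = []
    index = 0
    while "{}.{}".format(prefix, index) in props:
        values.append(props["{}.{}".format(prefix, index)])
        index += 1
    return values


# Hypothetical settings with a gap: index 2 is missing.
props = {
    "db.change.allow.url.0": "jdbc:mysql://localhost:3306/grouper_dev",
    "db.change.allow.url.1": "jdbc:mysql://localhost:3306/grouper_test",
    "db.change.allow.url.3": "jdbc:mysql://localhost:3306/grouper_qa",
}

print(collect_indexed(props, "db.change.allow.url"))
```

Under this reading, the entry numbered 3 is silently ignored because 2 is absent, which is why gaps in the numbering should be avoided.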
There is a substantial section for include/exclude and requireGroups. These are group types which help you create composite groups to manage include/exclude lists for groups (especially useful for Grouper loader provisioned groups), or groups which require memberships in other groups (e.g. activeStaff). See the grouper.example.properties file if you want to customize things, but to enable, set these:
Here are some requireGroups (increment the 0 to add more):
Hooks are ways to plug in your own Java code to affect how Grouper does its logic. You can register multiple classes for one hook base class by comma-separating the hook implementations. You can also register hooks at runtime with: GrouperHookType.addHookManual("hooks.group.class", YourSchoolGroupHooks2.class);
See the grouper.example.properties for the full list, here are two examples:
You can validate group attributes via regex (see grouper.example.properties for more info) (increment the 0 to add more)
Database structure data definition language (DDL) settings (see grouper.example.properties for full list)
Mail settings (optional, e.g. for daily report from the loader)
Grouper requires that all subjects must be explicitly granted access or naming privileges (cf. Glossary), with one caveat. There is a special "subject" internal to Grouper called the ALL subject, which is a stand-in for any subject. The ALL subject can be granted a privilege in lieu of assigning that privilege explicitly to each and every subject.
When a new group or naming stem is created, any of its associated privileges can be granted by default to the ALL subject. This is configured by a series of properties in grouper.properties, one per privilege. If a property has the value "true" then ALL is granted that privilege by default when a group or naming stem is created. Otherwise it is not, and hence no subject has that privilege by default. The groups read and view settings below are set to true to make the quickstart easier to run. If you have a deployment where privacy among Grouper users is important, you should consider changing those to false so that access to see groups or view their memberships must be explicitly assigned.
Value in Grouper v1.5 Distribution
Grouper has another special "subject" called GrouperSysAdmin that acts as a super-user. GrouperSysAdmin is permitted to do everything - the privilege system is ignored for that special subject. Grouper can be configured to consider all members of a distinguished group to be able to act as super-users, much as the "wheel" group does in BSD Unix. Two properties control this behavior:
"true" or "false" to enable or disable this capability.
The group name of the group whose members are to be considered security-equivalent to GrouperSysAdmin.
The Grouper UI enables users that belong to the wheel group to choose when to act with the privileges of GrouperSystem and when to act as their normal selves.
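The interplay of the ALL subject, GrouperSysAdmin, and the wheel group can be summarized in a small model. This is a conceptual sketch, not Grouper's privilege code: the function names and privilege labels are invented, and it treats wheel members as always acting with elevated privilege, whereas the UI lets them choose.

```python
ROOT = "GrouperSysAdmin"  # super-user subject
ALL = "GrouperAll"        # stand-in for any subject


def default_grants(all_defaults):
    """Grants for a new group: ALL gets each privilege whose default is true."""
    return {priv: {ALL} for priv, enabled in all_defaults.items() if enabled}


def has_privilege(subject, privilege, grants, wheel_members=None):
    """Check a privilege; wheel_members is None when the wheel group is off."""
    if subject == ROOT:
        return True
    if wheel_members is not None and subject in wheel_members:
        return True
    holders = grants.get(privilege, set())
    return subject in holders or ALL in holders


# New group with read and view granted to ALL by default, update not.
grants = default_grants({"read": True, "view": True, "update": False})
grants.setdefault("update", set()).add("alice")  # explicit grant

print(has_privilege("bob", "read", grants))
print(has_privilege("bob", "update", grants))
print(has_privilege("bob", "update", grants, wheel_members={"bob"}))
```

Bob can read through the ALL default, cannot update without an explicit grant, and can update once he belongs to an enabled wheel group.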
Changing the display name of GrouperAll and GrouperSystem
Before version 1.3.0 the Grouper UI referred to EveryEntity as GrouperAll and GrouperSysAdmin as GrouperSystem. As of version 1.3.1 the name attribute of GrouperAll and GrouperSystem can be set through the properties below.
The name to use for GrouperAll instead of EveryEntity
The name to use for GrouperSystem instead of GrouperSysAdmin
If you choose not to use the defaults you will have to update the UI nav.properties file to ensure consistency e.g. subject.privileges.from-grouperall=inherits from EveryEntity
Changing default privilege caching
Grouper includes three `PrivilegeCache` implementations:
- NoCachePrivilegeCache - No caching performed
- SimplePrivilegeCache - Caches results but flushes all cached entries upon any update
- SimpleWheelPrivilegeCache - Same as 'SimplePrivilegeCache' but with better support for using a wheel group.
The privileges.access.cache.interface and privileges.naming.cache.interface properties can be set to determine the privilege caching regimen. The default is edu.internet2.middleware.grouper.NoCachePrivilegeCache.
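The "cache results, flush everything on any update" strategy described for SimplePrivilegeCache can be sketched as below. The class and method names are invented for the example and do not mirror Grouper's PrivilegeCache interface; the backing store is a plain in-memory set.

```python
class FlushOnUpdateCache:
    """Caches privilege lookups and discards all of them on any update."""

    def __init__(self):
        self._grants = set()  # backing store of (subject, group, privilege)
        self._cache = {}
        self.store_reads = 0  # counts lookups that missed the cache

    def _read_store(self, key):
        self.store_reads += 1
        return key in self._grants

    def has_privilege(self, subject, group, privilege):
        key = (subject, group, privilege)
        if key not in self._cache:
            self._cache[key] = self._read_store(key)
        return self._cache[key]

    def grant(self, subject, group, privilege):
        self._grants.add((subject, group, privilege))
        self._cache.clear()  # any update flushes every cached entry


cache = FlushOnUpdateCache()
cache.has_privilege("alice", "staff", "read")  # miss: reads the store
cache.has_privilege("alice", "staff", "read")  # hit: served from the cache
cache.grant("bob", "staff", "read")            # unrelated update flushes all
cache.has_privilege("alice", "staff", "read")  # miss again after the flush
print(cache.store_reads)
```

Flushing everything keeps the cache trivially correct at the cost of extra store reads after each update, which is a reasonable trade when updates are rare compared with checks.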
Using a privilege management system external to Grouper
Grouper's internal security implementation relies on Java interfaces, including one for Naming Privileges and another for Access Privileges. Grouper ships with classes that implement these interfaces, but third parties are free to supply their own and so manage Grouper privileges using a privilege management system external to Grouper. The following properties declare the Java classes that Grouper will use to implement these interfaces:
classname of the java class that implements the Access Interface
classname of the java class that implements the Naming Interface
classname of the java class that implements the Attribute access Interface
It is not clear that anyone has taken advantage of this; the internal privilege management is the one usually used. Performance will also be drastically reduced if external privileges are used, since internal privilege management can join tables in one query to securely select from the registry. If you want to store privileges externally, another option is to provision the internal access adapter settings and table data from an outside system.
Notifications / change log
To enable the change log, set this:
Whether records should be inserted into grouper_change_log_temp when events happen
If you are using ldappc, then you need to keep updating the last membership time. If not (and not using this column for other custom reasons), you will reduce the number of queries by setting this to false.
If true, when a membership is added to a group (either a privilege or a list member),
If true, when a membership is added to a stem (this would be a naming privilege),
Logging is configured in the grouper/conf/log4j.properties configuration file. By default Grouper will write event log information to grouper/grouper_event.log, error logging to grouper/grouper-error.log, and debug logging, if enabled, to grouper/grouper-debug.log. The log4j configuration can be adjusted to control the verbosity, type and output of Grouper's logging.
The Grouper Daemon is a command-line process which handles many tasks, including running jobs that load or remove memberships of groups based on results from a SQL query, executing change log consumers, validating Grouper Rules, and updating enabled/disabled flags. It is configured in the grouper-loader.properties file. The common settings are described below; see grouper-loader.example.properties for descriptions of all settings.
If you want Grouper to create the loader type and attributes when they do not already exist, set this. Otherwise you need to add the type and attributes yourself with GSH.
If most of your loader queries come from one subject source, you can set the default subject source here, so your loader queries only need to return SUBJECT_ID and not the source id also:
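The effect of a default subject source on loader query results can be modeled as follows. This is an illustrative sketch, not loader code: the function name is invented, the column name SUBJECT_SOURCE_ID and the source id "jdbc" are assumptions, and rows are represented as plain dictionaries.

```python
def normalize_rows(rows, default_source_id=None):
    """Pair each subject ID with its source, falling back to the default."""
    normalized = []
    for row in rows:
        source_id = row.get("SUBJECT_SOURCE_ID") or default_source_id
        if source_id is None:
            raise ValueError("no source id for subject " + row["SUBJECT_ID"])
        normalized.append((row["SUBJECT_ID"], source_id))
    return normalized


# Query results: the first row relies on the default source.
rows = [
    {"SUBJECT_ID": "10021368"},
    {"SUBJECT_ID": "app-42", "SUBJECT_SOURCE_ID": "applications"},
]

print(normalize_rows(rows, default_source_id="jdbc"))
```

With a default configured, queries against the main source only need to return SUBJECT_ID, while rows from other sources can still name their source explicitly.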
If all of your queries are run with your Grouper database credentials (e.g. if you have other schemas on the same database to query), you don't have to configure database connections. If you have other databases to query (e.g. an external data warehouse), you can configure the database credentials here. The name here is "warehouse"; use different names for different connections
If you want to use the Grouper daily report, configure in the grouper-loader.properties (and the mail settings described above in grouper.properties). This is a daily email that gets sent to you about your grouper health, including status about all the loader jobs in the last day.
Schedule when to run the enabled/disabled cron.
Manage change log consumers and also specify whether some events are written to the change log.
Schedule when to run daemon to validate rules.