You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 4 Next »


In grouper 2.4 patch #46, Grouper can sync database tables.  This is intended for these use cases:

  1. Sync memberships to their own table for performance reasons (when you are doing reporting and have a lot of joins and subselects, a table without lots of joins can help a lot)
  2. Sync memberships to another database.  Other applications might not want to depend on Grouper at runtime, or have other performance reasons
  3. Sync a subject table from another database to Grouper.  This might be for performance or availability reasons
  4. Sync some table to another table.  This function is generic so use it for whatever you want (groups, attributes, etc)

Example: sync memberships to a table in the Grouper database or another database

  1. Make a view with the data you want to sync
  2. This can be based on some ad hoc groups, folders, or attributes on groups
  3. For instance this could be the view



  4. This view is defined (oracle) as:

    /* Formatted on 5/1/2019 11:27:03 AM (QP5 v5.252.13127.32847) */
    CREATE OR REPLACE FORCE VIEW PENN_MEMBERSHIPS_FEEDER_V
    (
       GROUP_ID, GROUP_NAME,LIST_NAME, LIST_TYPE, MEMBER_ID, SUBJECT_ID, SUBJECT_SOURCE
    )
       BEQUEATH DEFINER
    AS
       SELECT GROUP_ID,  group_name, list_name, GMLV.LIST_TYPE, GMLV.MEMBER_ID, GMLV.SUBJECT_ID, GMLV.SUBJECT_SOURCE
         FROM grouper_memberships_lw_v gmlv
        WHERE     (   GMLV.GROUP_NAME IN ('penn:community:student:studentThisTerm',
                                          'penn:community:student:degreePursual:degreePursual_LLM',
                                          ... )
                   OR gmlv.group_name LIKE 'penn:isc:ait:apps:atlassian:%')
              AND GMLV.SUBJECT_SOURCE = 'pennperson'
              AND list_type = 'list';
    
    
  5. Configure databases in grouper.client.properties

    grouperClient.jdbc.pcom.driver = oracle.jdbc.driver.OracleDriver
    grouperClient.jdbc.pcom.url = jdbc:oracle:thin:@(DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = ...
    grouperClient.jdbc.pcom.user = myuser
    grouperClient.jdbc.pcom.pass = ******
    
    grouperClient.jdbc.awsDev.driver = org.postgresql.Driver
    grouperClient.jdbc.awsDev.url = jdbc:postgresql://penn ...
    grouperClient.jdbc.awsDev.user = anotheruser
    grouperClient.jdbc.awsDev.pass = ******
    
    
  6. Configure the table sync

    # grouper client database key where copying data from
    # {valueType: "string"}
    grouperClient.syncTable.memberships.databaseFrom = pcom
    
    # if using a different schema than the connecting schema
    # {valueType: "string"}
    grouperClient.syncTable.memberships.schemaFrom = 
    
    # table or view where copying data from
    # {valueType: "string"}
    grouperClient.syncTable.memberships.tableFrom = PENN_MEMBERSHIPS_FEEDER_V
    
    # grouper client database key where copying data to
    # {valueType: "string"}
    grouperClient.syncTable.memberships.databaseTo = awsDev
    
    # if using a different schema to copy data to
    # {valueType: "string"}
    grouperClient.syncTable.memberships.schemaTo = 
    
    # table or view where copying data to
    # {valueType: "string"}
    grouperClient.syncTable.memberships.tableTo = PENN_MEMBERSHIPS_TEMP
    
    # columns must match in from and to tables, you can specify columns or do all with an asterisk
    # {valueType: "string"}
    grouperClient.syncTable.memberships.columns = *
    
    # if there is a primary key, list it, else list the composite keys
    # {valueType: "string"}
    grouperClient.syncTable.memberships.primaryKeyColumns = group_id, list_name, member_id
    
    # the grouping column is what is uniquely selected, and then batched through to get data.
    # {valueType: "string"}
    grouperClient.syncTable.memberships.groupingColumn = group_id
    
    # the grouping column is what is uniquely selected, and then batched through to get data, defaults to 10000
    # {valueType: "integer"}
    grouperClient.syncTable.memberships.groupingSize = 5
    
    
  7. Setup an other job to run this

    # Object Type Job class
    # {valueType: "class", mustExtendClass: "edu.internet2.middleware.grouper.app.loader.OtherJobBase", mustImplementInterface: "org.quartz.Job"}
    otherJob.membershipSync.class = edu.internet2.middleware.grouper.app.tableSync.TableSyncOtherJob
    
    # Object Type Job cron
    # {valueType: "string"}
    otherJob.membershipSync.quartzCron = 0 * * * * ?
    
    # this is the key in the grouper.client.properties that represents this job
    otherJob.membershipSync.grouperClientTableSyncConfigKey = memberships

Example: sync subject source

  1. Setup view.  Note, its important to have a column of type date or timestamp which is the last updated time for real time updates



  2. Setup databases in grouper.client.properties (see memberships example)
  3. Configure the table sync

    # grouper client database key where copying data from
    # {valueType: "string"}
    grouperClient.syncTable.personSource.databaseFrom = pcom
    
    # if using a different schema than the connecting schema
    # {valueType: "string"}
    grouperClient.syncTable.personSource.schemaFrom = 
    
    # table or view where copying data from
    # {valueType: "string"}
    grouperClient.syncTable.personSource.tableFrom = PERSON_SOURCE
    
    # grouper client database key where copying data to
    # {valueType: "string"}
    grouperClient.syncTable.personSource.databaseTo = awsDev
    
    # if using a different schema to copy data to
    # {valueType: "string"}
    grouperClient.syncTable.personSource.schemaTo = 
    
    # table or view where copying data to
    # {valueType: "string"}
    grouperClient.syncTable.personSource.tableTo = PERSON_SOURCE_TEMP
    
    # columns must match in from and to tables, you can specify columns or do all with an asterisk
    # {valueType: "string"}
    grouperClient.syncTable.personSource.columns = *
    
    # if there is a primary key, list it, else list the composite keys
    # {valueType: "string"}
    grouperClient.syncTable.personSource.primaryKeyColumns = penn_id
    
    # the grouping column is what is uniquely selected, and then batched through to get data.
    # {valueType: "string"}
    grouperClient.syncTable.personSource.groupingColumn = penn_id
    
    # the grouping column is what is uniquely selected, and then batched through to get data, defaults to 10000
    # {valueType: "integer"}
    grouperClient.syncTable.personSource.groupingSize = 10000
    
    # size of jdbc batches
    # {valueType: "integer"}
    grouperClient.syncTable.personSource.batchSize = 50
    
    # column of date or timestamp which is last updated of row
    # {valueType: "string"}
    grouperClient.syncTable.personSource.realTimeLastUpdatedCol = last_updated
    
    # database which holds table to keep status of real time and full sync
    # {valueType: "string"}
    grouperClient.syncTable.personSource.statusDatabase = awsDev
    
    # schema of status table if different than connecting schema
    # {valueType: "string"}
    grouperClient.syncTable.personSource.statusSchema = 
    
    # table of status where real time and full sync status is
    # {valueType: "string"}
    grouperClient.syncTable.personSource.statusTable = grouper_chance_log_consumer
    
    # if doing real time and nightly full sync, this is the hour where full sync should start
    # {valueType: "integer"}
    grouperClient.syncTable.personSource.fullSyncHourStart = 3
    
    # if doing real time and nightly full sync, this is the hour where full sync should have already started
    # {valueType: "integer"}
    grouperClient.syncTable.personSource.fullSyncHourEnd = 4
    
    
  4. Setup an other job to run this

    # Object Type Job class
    # {valueType: "class", mustExtendClass: "edu.internet2.middleware.grouper.app.loader.OtherJobBase", mustImplementInterface: "org.quartz.Job"}
    otherJob.personSourceSync.class = edu.internet2.middleware.grouper.app.tableSync.TableSyncOtherJob
    
    # Object Type Job cron
    # {valueType: "string"}
    otherJob.personSourceSync.quartzCron = 0 * * * * ?
    
    # this is the key in the grouper.client.properties that represents this job
    otherJob.personSourceSync.grouperClientTableSyncConfigKey = personSource
    
    
  • No labels