Terminology:

This is in Grouper v5+

Terminology

  • "Data field" means a user attribute. The word "attribute" is avoided here because it overlaps with the attribute framework

This is a suggestion for how user data could flow to Grouper in a future state

The problem this is trying to solve

...

  • This is named "data field" since "attribute" is used in many other places, e.g. the attribute framework

Description

  • Data field values are assigned to users or groups, or are globally available
  • The data can be single or multi-valued
  • The data can be structured in a row (the data in a cell can still be multi-valued)
  • The data is stored in Grouper and kept up to date by real-time updates and full syncs
  • Point in time history will be maintained
  • Security on data fields will ensure that private data remains private
  • The data will be stored efficiently so it does not take a lot of space and queries are efficient
  • Data fields are documented with examples so users can easily request access, see what data fields represent and how to use them
  • Data fields can be configured in the UI
  • The data can be used:
    • To construct ABAC policies on groups (scripted group based on data field values)
    • Subject sources can be replaced by a data field source (this is the future direction, and all subject sources will eventually need to be migrated)
    • Provisioning data about users to other systems
    • Reports about access and users
    • Etc


Data field flow

...

(Gliffy diagram: dataReimagined)

Setup entity resolvers

The first configuration step is to set up entity resolvers

For users

  • SQL queries
  • LDAP filters
  • WS calls

Returns

  • Single-valued data fields for users
  • Multi-valued data fields for users
  • Multi-valued rows of data fields for users (e.g. affiliation rows that have affiliation and dept)
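
The three result shapes above can be pictured with a small sketch. This is an illustration only and not Grouper code: the class name `ResolvedUser`, its attribute names, and the `nickname` field are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ResolvedUser:
    """Hypothetical container for what an entity resolver returns for one user."""
    subject_id: str
    # Single-valued data fields: one value per field name
    single: dict[str, str] = field(default_factory=dict)
    # Multi-valued data fields: a list of values per field name
    multi: dict[str, list[str]] = field(default_factory=dict)
    # Rows: row type -> list of rows, each row maps field name -> value
    rows: dict[str, list[dict[str, str]]] = field(default_factory=dict)


user = ResolvedUser(
    subject_id="12345678",
    single={"emailAddress": "a@b.c"},
    multi={"nickname": ["Al", "Ally"]},
    rows={"affiliation": [
        {"affiliation": "Staff", "dept": "MATH"},
        {"affiliation": "Student", "dept": "PHYS"},
    ]},
)
```

A user can hold several affiliation rows at once, which is what distinguishes a row from a plain multi-valued field: the values in one row (affiliation and dept) belong together.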

Two types of data fields

  • Informational
    • e.g. name, description, email, etc
    • Needed for provisioning or UI or WS
  • Access related
    • e.g. dept, title, school, DN
    • Needed for loading groups, jexl scripted groups, provisioning events

Point in time

  • Grouper can store point in time information about data fields

Assumption

All institutions are either

  • OK with a full sync of user data fields on a schedule, and that is how up to date the data is (e.g. every 30 minutes, hourly, daily)
  • or: Can get events of when data changes in source systems
  • or: Queries to source systems have last updated dates or change logs for real time updates
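
As a rough sketch of the third option (source queries that expose a last updated date), an incremental sync could look like the following. The function name, the tuple layout of `source_rows`, and the in-memory `stored` dict are assumptions made for illustration; they are not part of Grouper.

```python
from datetime import datetime


def incremental_sync(source_rows, stored, last_sync):
    """Copy only rows changed since last_sync into the local store.

    source_rows: iterable of (subject_id, field_name, value, last_updated)
    stored: dict keyed by (subject_id, field_name) -> value, updated in place
    Returns the set of subject ids whose stored data changed.
    """
    changed = set()
    for subject_id, field_name, value, last_updated in source_rows:
        if last_updated <= last_sync:
            continue  # not modified since the previous run
        key = (subject_id, field_name)
        if stored.get(key) != value:
            stored[key] = value
            changed.add(subject_id)
    return changed
```

The returned set of subject ids is what downstream consumers (loaders, scripted groups, provisioners) would need in order to recalculate only the affected users.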

Grouper gets that data

  • Copies to Grouper database
  • Could do some light processing of the data
  • "Virtual data fields" can have logic and make a complicated description data field (across multiple resolver sources)
    • It's possible that this could help with the problem of having too many subject sources, though this isn't intended to be an identity system
  • Can assign security so Grouper knows who is allowed to read which data
    • Each data field could have a group assigned who can see the data
  • There are real-time events or timestamps that ensure data is up to date
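
The idea that each data field has a group of people who can see it can be sketched as a simple filter. This is a minimal illustration, assuming a deny-by-default rule for fields without a configured group; the function and parameter names are hypothetical.

```python
def visible_values(viewer_group_ids, field_viewable_by, values):
    """Return only the field values the viewer is allowed to read.

    viewer_group_ids: set of group ids the viewer is a member of
    field_viewable_by: dict of field name -> group id allowed to view it
    values: dict of field name -> value for one user
    Fields with no configured group are hidden (an assumption of this sketch).
    """
    return {
        name: value
        for name, value in values.items()
        if field_viewable_by.get(name) in viewer_group_ids
    }
```

For example, a viewer who is only in group `abc123` would see `emailAddress` but not `org` if `org` is viewable by group `xyz234`.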

Subject source

  • Points to Grouper's database
    • Instead of being configured against a source, it would be configured against entity resolver data
    • Can use data from multiple sources
  • Note: if entity resolver data is secure and available over UI/WS, then the subject doesn't need as many fields, e.g. Penn would not need first name and last name etc.
  • Subject could just be an id
  • People who are allowed to see various entity resolver data fields would see description a certain way, name a certain way, and whatever data fields they can see when they need them
  • Imagine a more detailed subject page for people who can see the data, which makes it easier to troubleshoot access

Members table

  • Members table can be stripped down since data is in the entity tables

Loaders

  • Loaders and jexl scripted groups can be written on top of entity data
  • Non-admins can securely use that data since Grouper knows who is allowed to see what
  • When the entity resolver learns in real time that data changed, it knows which loader/jexl scripted group to update
  • Not all data about users will be in entity resolvers: more than what was in the subject source, but not everything
  • If there is peripheral data you can make SQL/LDAP loaders for that
  • Privileges for loaded groups could be loaded with users who can see all the related data fields
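
One way to picture how a data change could be routed to the right loader or scripted group is an index from data field to the groups that depend on it. The class below is a hypothetical sketch, not an existing Grouper API, and the group names in the usage note are made up.

```python
from collections import defaultdict


class FieldDependencyIndex:
    """Hypothetical index from data field name to the groups whose logic reads it."""

    def __init__(self):
        self._groups_by_field = defaultdict(set)

    def register(self, group_name, field_names):
        """Record that group_name's loader or script reads these fields."""
        for field_name in field_names:
            self._groups_by_field[field_name].add(group_name)

    def groups_to_update(self, changed_field_names):
        """Return every group that reads at least one of the changed fields."""
        affected = set()
        for field_name in changed_field_names:
            affected |= self._groups_by_field.get(field_name, set())
        return affected
```

With `register("ref:mathStaff", ["org", "affiliation"])` and `register("ref:allEmail", ["emailAddress"])`, a change to `org` would return only `ref:mathStaff`, so unrelated groups are not recalculated.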

UI/WS

  • Imagine more data fields than subject data fields available over WS/UI securely in one query

Provisioning

  • No more "subject link"
  • You can provision any entity resolver data easily
  • When data changes, Grouper can tell a provisioner to recalculate a user

Summary

In summary, here is a metaphor: we used to have SQL credentials in multiple places, then we made an external system layer to re-use them.  This suggestion is similar: have a data layer that can be re-used across features.  It includes real-time updates, security, and data manipulation, all configured centrally.  Why?  If we want to be ABAC and data field-based, we need to organize our data fields.

Data model

grouper_members

Existing table

  • id (012)
    • subject_id (12345678)

grouper_data_field

Types of data fields for user or rows

  • id (234)
    • system_name (emailAddress)
    • display_name (Email)
    • type (user)
    • cardinality (single-valued)
    • description
    • viewable_by_group_id abc123
  • id (567)
    • system_name (org)
    • display_name (Org)
    • type (row)
    • cardinality (single-valued)
    • description
    • viewable_by_group_id xyz234

grouper_data_row

Type of data field rows available for users

  • id (123)
    • system_name (affiliation)
    • display_name (Affiliation)
    • description

grouper_data_row_field

Which fields are in which rows

  • id (538)
    • grouper_data_row_id (123)
    • grouper_data_field_id (567)

grouper_data_field_row_sec

Row level security for data

  • id (941)
    • grouper_data_field_id (234)
    • group_id_of_result_member pcm428
    • viewable_by_group_id rst567
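
One possible reading of this table, shown as a sketch: a viewer may see the data field for a given member only if the member is in `group_id_of_result_member` and the viewer is in `viewable_by_group_id`. That interpretation and the function name are assumptions; the page does not spell out the semantics.

```python
def can_view_value(rule, viewer_group_ids, member_group_ids):
    """Apply one row-level security rule (assumed semantics, see lead-in).

    rule: dict with keys group_id_of_result_member and viewable_by_group_id
    viewer_group_ids: set of group ids the person looking at the data is in
    member_group_ids: set of group ids the member being looked at is in
    """
    return (
        rule["group_id_of_result_member"] in member_group_ids
        and rule["viewable_by_group_id"] in viewer_group_ids
    )


# Rule from the example above: members of pcm428 are visible to viewers in rst567
rule = {"group_id_of_result_member": "pcm428", "viewable_by_group_id": "rst567"}
```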

grouper_member_data_field

Assignment of a data field to an entity

  • id (480)
    • member_id (012)
    • grouper_data_field_id (234)
    • value_id (789)
    • created_on 1/2/3
    • updated_on 2/3/2021

grouper_member_data_row

Assignment of a row of data to an entity

  • id (321)
    • member_id (012)
    • grouper_data_row_id (123)
    • created_on 4/5/2021

grouper_member_data_row_field

Assignment of a field to a row assignment

  • id (637)
    • grouper_member_data_row_id (321)
    • grouper_data_field_id (567)
    • value_id (654)
    • created_on 4/5/2021
    • updated_on 2/3/2021

grouper_dictionary

Keep data field values here to reduce data redundancy
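
A minimal sketch of how a dictionary table can reduce redundancy, using an in-memory SQLite database. Only a few columns of the tables above are reproduced. The `value` column on grouper_dictionary, the link from `value_id` to that table, and the helper `intern_value` are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE grouper_data_field (id INTEGER PRIMARY KEY, system_name TEXT);
    CREATE TABLE grouper_dictionary (id INTEGER PRIMARY KEY, value TEXT UNIQUE);
    CREATE TABLE grouper_member_data_field (
        id INTEGER PRIMARY KEY,
        member_id INTEGER,
        grouper_data_field_id INTEGER REFERENCES grouper_data_field(id),
        value_id INTEGER REFERENCES grouper_dictionary(id)
    );
""")


def intern_value(value):
    """Return the dictionary id for value, inserting it only if it is new."""
    conn.execute("INSERT OR IGNORE INTO grouper_dictionary (value) VALUES (?)", (value,))
    row = conn.execute("SELECT id FROM grouper_dictionary WHERE value = ?", (value,)).fetchone()
    return row[0]


conn.execute("INSERT INTO grouper_data_field (id, system_name) VALUES (234, 'emailAddress')")
# Two members share one value, so the string itself is stored only once
for member_id in (12, 13):
    conn.execute(
        "INSERT INTO grouper_member_data_field (member_id, grouper_data_field_id, value_id) "
        "VALUES (?, 234, ?)",
        (member_id, intern_value("a@b.c")),
    )

# Resolve values back through the dictionary
emails = conn.execute("""
    SELECT m.member_id, d.value
    FROM grouper_member_data_field m
    JOIN grouper_dictionary d ON d.id = m.value_id
    WHERE m.grouper_data_field_id = 234
    ORDER BY m.member_id
""").fetchall()
```

Assignments then carry only a small integer `value_id`, and repeated values such as a department name shared by thousands of users take up one dictionary row.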

  • In this diagram, the green data field resolvers are either cached (e.g. source systems) or not (a one-off which doesn't need the overhead of PIT etc)
    • A provisioner target could have entity data field values for users

Previous state

(Gliffy diagram: dataCurrentState)

Data field and row diagram

(Gliffy diagram: dataFieldExample)


Example usages:

  • Provision any identifier to a target without having to "resolve the subject"
  • Make a JEXL scripted group: people who have a payroll data row where org is "MATH" and have an affiliation data row where affiliation name is Staff or Faculty with an end date in the next month.  Put a rule on that group for the Business Analyst to review people who might need to be renewed.
  • Load users from Zoom and match accounts to users in Grouper by any of their email addresses
  • A staff member creates a report where a column represents if the user in the service is an Employee
  • A help desk worker can see the history of affiliations and troubleshoot access by seeing that the user's payroll org recently changed
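
The scripted group example above can be expressed as a predicate over a user's data rows. The sketch below is written in Python to show the logic only; it is not JEXL syntax. The row and field names (`payroll`, `org`, `affiliation`, `endDate`) and the 30-day reading of "next month" are assumptions.

```python
from datetime import date, timedelta


def needs_renewal_review(rows, today):
    """Decide whether one user belongs in the renewal review group.

    rows: dict of row type -> list of row dicts for one user
    today: the date the policy is evaluated on
    """
    in_math = any(r.get("org") == "MATH" for r in rows.get("payroll", []))
    cutoff = today + timedelta(days=30)  # "next month" approximated as 30 days
    ending_soon = any(
        r.get("affiliation") in ("Staff", "Faculty")
        and r.get("endDate") is not None
        and today <= r["endDate"] <= cutoff
        for r in rows.get("affiliation", [])
    )
    return in_math and ending_soon
```

Both conditions must hold, but they may be satisfied by different rows: the payroll row supplies the org and the affiliation row supplies the end date.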


Configure data field privacy realms

A privacy realm is a configuration for privacy of one or many data fields.  Re-use them as much as you can to reduce the number of configurations

Image Added

Configure data fields

Each data field or row column is configured as a data field.

Image Added

Configure data rows

Rows are configured as a "table".  The columns are data fields.

Image Added

Configure data providers

A data provider is a set of queries that load data into Grouper in real time or full sync

Image Added


Configure data provider queries

These select data from the target to populate Grouper with data field values.  A single provider can have multiple queries.  Each query has one provider.

Image Added

Configure data provider real time query

This helps the change log know which data to update

Image Added

...

  • value (a@b.c)

...