Terminology:

  • "data field" is a user attribute; the word "attribute" is avoided here because it overlaps with the attribute framework

This is a suggestion for how user data could flow to Grouper in a future state.

The problem this is trying to solve

  • User data in subjects, provisioners, and loaders solve similar problems
  • Real time data solved in multiple ways
  • Security of who is allowed to see what (by person aka row or data field aka value)
  • Efficiency of being able to query data without reaching out to other resources
  • Ability to use data from multiple sources at one time
  • Reduce the number of network calls in various places
  • Reduce the SQL and LDAP syncs required to make things work
  • Troubleshooting access is difficult when the history of data field changes is not known
  • Unresolvable subjects are a pain... history of data fields of users will help


(Diagram: dataReimagined)

Setup entity resolvers

The first configuration step is to set up entity resolvers

For users

  • SQL queries
  • LDAP filters
  • WS calls

Returns

  • Single valued data fields for users
  • Multi-valued data fields for users
  • Multi-valued rows of data fields for users (e.g. affiliation rows that have affiliation and dept)
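
The three return shapes above could be modeled roughly as follows. This is a minimal sketch: the class and attribute names (`ResolvedEntity`, `single`, `multi`, `rows`) and the sample values are hypothetical and are not part of Grouper.

```python
from dataclasses import dataclass, field


@dataclass
class ResolvedEntity:
    """Hypothetical container for what an entity resolver returns for one user."""

    subject_id: str
    # Single-valued data fields, e.g. {"emailAddress": "a@b.c"}
    single: dict[str, str] = field(default_factory=dict)
    # Multi-valued data fields, e.g. {"dept": ["math", "physics"]}
    multi: dict[str, list[str]] = field(default_factory=dict)
    # Multi-valued rows: each row groups several data fields under a row name
    rows: dict[str, list[dict[str, str]]] = field(default_factory=dict)


# Illustrative values only
user = ResolvedEntity(
    subject_id="12345678",
    single={"emailAddress": "a@b.c"},
    multi={"dept": ["math", "physics"]},
    rows={"affiliation": [{"affiliation": "staff", "dept": "math"}]},
)
```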

Two types of data fields

  • Informational
    • e.g. name, description, email, etc
    • Needed for provisioning or UI or WS
  • Access related
    • e.g. dept, title, school, DN
    • Needed for loading groups, jexl scripted groups, provisioning events

Point in time

  • Grouper can store point in time information about data fields

Assumption

All institutions are either

  • OK with a full sync of user data fields on a schedule, where the schedule determines how up to date the data is (e.g. every 30 minutes, hourly, daily)
  • or: Can get events of when data changes in source systems
  • or: Queries to source systems have last updated dates or change logs for real time updates
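
The third option (source rows that carry a last updated date) might look like the sketch below. The function name, the record layout, and the `updated_on` key are assumptions made for illustration; a real resolver would push the date filter into the SQL or LDAP query rather than filter in memory.

```python
from datetime import datetime


def rows_changed_since(source_rows, last_sync):
    """Return only the source rows updated after the previous sync."""
    return [row for row in source_rows if row["updated_on"] > last_sync]


# Illustrative source data
source_rows = [
    {"subject_id": "12345678", "dept": "math", "updated_on": datetime(2021, 2, 3, 9, 0)},
    {"subject_id": "23456789", "dept": "physics", "updated_on": datetime(2021, 2, 3, 11, 0)},
]
last_sync = datetime(2021, 2, 3, 10, 0)

# Only the second user changed after the last sync, so only that user is re-processed
changed = rows_changed_since(source_rows, last_sync)
```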

Grouper gets that data

  • Copies to Grouper database
  • Could process the data a tad
  • "Virtual data fields" can have logic and make a complicated description data field (across multiple resolver sources)
    • It's possible that this could help with the problem of having too many subject sources, though this isn't intended to be an identity system
  • Can assign security so Grouper knows who is allowed to read which data
    • Each data field could have a group assigned who can see the data
  • There are real time events or timestamps that ensure data is up to date
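
As a rough illustration of per-field security combined with a virtual data field, consider the sketch below. All names here (`visible_fields`, `virtual_description`, the group names, and the sample user) are hypothetical; the only idea taken from this page is that each data field has a group that is allowed to see it.

```python
def visible_fields(entity_fields, field_reader_group, viewer_groups):
    """Keep only the data fields whose reader group the viewer belongs to.

    Fields with no reader group assigned are hidden (an assumption of this sketch).
    """
    return {
        name: value
        for name, value in entity_fields.items()
        if field_reader_group.get(name) in viewer_groups
    }


def virtual_description(fields):
    """Hypothetical virtual data field built from whichever inputs are visible."""
    parts = [fields.get("name"), fields.get("dept")]
    return " - ".join(part for part in parts if part)


entity_fields = {"name": "Jane Doe", "dept": "math", "emailAddress": "a@b.c"}
field_reader_group = {"name": "everyone", "dept": "hrReaders", "emailAddress": "emailReaders"}

# A viewer in hrReaders sees name and dept; a viewer only in everyone sees name
hr_view = visible_fields(entity_fields, field_reader_group, {"everyone", "hrReaders"})
public_view = visible_fields(entity_fields, field_reader_group, {"everyone"})
```

With these inputs the same user gets a different description depending on who is looking, which is the behavior the subject source section relies on.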


Subject source

  • Points to Grouper's database
    • Instead of being configured against sources, it would be configured against entity resolver data
    • Can use data from multiple sources
    • All identifiers must be unique
  • Note: if entity resolver data is secure and available over UI/WS, then the subject doesn't need any fields, e.g. Penn would not need first name, last name, etc. in the subject configuration.
  • Subject is really just a collection of prioritized identifiers (e.g. employeeId is highest priority) and attributes
  • People who are allowed to see various entity resolver data fields would see description a certain way, name a certain way, and whatever data fields they can see when they need it
  • Imagine a more detailed subject page for people who can see the data... easier to troubleshoot access
  • If an employee ID does change (and no other conflicts), the user could be resolved by other identifiers and it might "just work"
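
The "resolved by other identifiers" case could work along these lines. This is a sketch under assumptions: `resolve_member`, the in-memory index, and the identifier values (including the login ID `jdoe`) are made up for illustration.

```python
def resolve_member(identifier_index, candidate_identifiers):
    """Return the member id for the first known identifier, in priority order."""
    for identifier in candidate_identifiers:
        member_id = identifier_index.get(identifier)
        if member_id is not None:
            return member_id
    return None


# Known identifiers for member 012: the old employee ID and a login ID
identifier_index = {"12345678": "012", "jdoe": "012"}

# The source now sends a new employee ID (highest priority) plus the same login ID
resolved = resolve_member(identifier_index, ["87654321", "jdoe"])
```

Because the login ID still matches, the record resolves to the existing member, and the new employee ID can then be added as another identifier.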

grouper_members: uuid, idIndex, subjectType (group/person/app/thing), search strings, sort strings, resolvable, etc

grouper_members_identifiers: grouper_member_idIndex, subject_identifier (unique)


When data fields are referenced, it is also a two-part process. If the reference is to a group (and the user is allowed to see it), go to the group table(s); if it is anything other than a group, go to the data field tables.

Loaders

  • Loaders and jexl scripted groups can be written on top of entity data
  • Non admins can securely use that data since Grouper knows who is allowed to see what
  • When the entity resolver knows that data changed real-time, it knows which loader/jexl scripted group to update
  • Not all data about users will be in entity resolvers: more than what was in the subject source, but not everything
  • If there is peripheral data you can make SQL/LDAP loaders for that
  • Privileges for loaded groups could be loaded with users who can see all the related data fields
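
One way the entity resolver could know which loader or jexl scripted group to update is an index from data field to dependent groups. The class below is a hypothetical sketch, and the group names are invented.

```python
from collections import defaultdict


class FieldDependencyIndex:
    """Hypothetical index: data field name -> groups whose definition uses it."""

    def __init__(self):
        self._groups_by_field = defaultdict(set)

    def register(self, group_name, field_names):
        """Record that a loader or jexl scripted group depends on these fields."""
        for field_name in field_names:
            self._groups_by_field[field_name].add(group_name)

    def groups_to_update(self, changed_fields):
        """Return the groups that need recalculation when these fields change."""
        affected = set()
        for field_name in changed_fields:
            affected |= self._groups_by_field.get(field_name, set())
        return sorted(affected)


index = FieldDependencyIndex()
index.register("ref:mathStaff", ["dept", "affiliation"])
index.register("ref:allFaculty", ["affiliation"])
```

A change to `dept` would then touch only `ref:mathStaff`, while a change to `affiliation` would touch both groups.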

UI/WS

  • Imagine more data fields than subject data fields available over WS/UI securely in one query

Provisioning

  • No more "subject link"
  • You can provision any entity resolver data easily
  • When data changes, Grouper can tell a provisioner to recalc a user

Summary

In summary, here is a metaphor: we used to have SQL credentials in multiple places, then we made an external system layer to re-use them. This suggestion is similar: have a data layer that can be re-used across things. It includes real-time updates, security, and data manipulation, all configured centrally. Why? If we want to be ABAC and data field-based, we need to organize our data fields.

Data model

grouper_members

Existing table can be stripped down since data is in the entity tables

  • id (012)
    • subject_id (12345678)
    • idIndex
    • subjectType (group / person / app / thing)
    • search strings
    • sort strings
    • resolvable

grouper_members_identifiers

Make sure identifiers are unique.

When subjects are looked up, it can be a two part process (instead of N-part for N subject sources).

  1. Look at groups in the group table
  2. Look at entities (including GrouperSystem, users, apps, things) in the data_field tables based on data fields that are marked as identifiers
  • id (737)
    • member_id (012)
    • subject_identifier (12345678)
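
The two-part lookup above can be sketched as follows. The function name and the in-memory dictionaries stand in for the group table and the identifier table and are assumptions of this example; the group name is invented.

```python
def find_subject(subject_identifier, groups_by_name, member_id_by_identifier):
    """Two-part lookup: groups first, then identifier-marked data fields."""
    # Part 1: look at groups in the group table
    if subject_identifier in groups_by_name:
        return ("group", groups_by_name[subject_identifier])
    # Part 2: look at entities (users, apps, things) by unique identifier
    member_id = member_id_by_identifier.get(subject_identifier)
    if member_id is not None:
        return ("entity", member_id)
    return None


groups_by_name = {"ref:mathStaff": "g1"}
member_id_by_identifier = {"12345678": "012"}
```

However many sources feed the data field tables, the lookup stays at two steps.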

grouper_data_field

Types of data fields for users or rows

  • id (234)
    • system_name (emailAddress)
    • display_name (Email)
    • type (user)
    • cardinality (single-valued)
    • description
    • viewable_by_group_id abc123
  • id (567)
    • system_name (org)
    • display_name (Org)
    • type (row)
    • cardinality (single-valued)
    • description
    • viewable_by_group_id xyz234

grouper_data_row

Types of data rows available for users

  • id (123)
    • system_name (affiliation)
    • display_name (Affiliation)
    • description

grouper_data_row_field

Which fields are in which rows

  • id (538)
    • grouper_data_row_id (123)
    • grouper_data_field_id (567)

grouper_data_field_row_sec

Row level security for data

  • id (941)
    • grouper_data_field_id (234)
    • group_id_of_result_member pcm428
    • viewable_by_group_id rst567
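
One possible reading of this table is that a viewer in `viewable_by_group_id` may see the given data field for members who are in `group_id_of_result_member`. The check below is a sketch of that interpretation only; the function name and the rule layout as dictionaries are assumptions.

```python
def can_view(field_id, target_member_groups, viewer_groups, row_sec_rules):
    """True if some rule lets the viewer see this field for the target member."""
    return any(
        rule["grouper_data_field_id"] == field_id
        and rule["group_id_of_result_member"] in target_member_groups
        and rule["viewable_by_group_id"] in viewer_groups
        for rule in row_sec_rules
    )


# The single example rule from the table above
row_sec_rules = [
    {
        "grouper_data_field_id": 234,
        "group_id_of_result_member": "pcm428",
        "viewable_by_group_id": "rst567",
    }
]
```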

grouper_data_member_field

Assignment of a data field to an entity. When data is synced to the data field tables, the sync will need to do some matching and assign a new grouper_members row if an existing one is not found.

  • id (480)
    • member_id (012)
    • grouper_data_field_id (234)
    • value_id (789)
    • created_on 1/2/3
    • updated_on 2/3/2021

grouper_data_member_row

Assignment of a row of data to an entity

  • id (321)
    • member_id (012)
    • grouper_data_row_id (123)
    • created_on 4/5/2021

grouper_data_member_row_field

Assignment of a field to a row assignment

  • id (637)
    • grouper_member_data_row_id (321)
    • grouper_data_field_id (567)
    • value_id (654)
    • created_on 4/5/2021
    • updated_on 2/3/2021
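
To show how the row tables fit together, the sketch below loads the example ids from this page into an in-memory SQLite database and reads back the affiliation row for member 012. The table and column names come from this page (including grouper_dictionary, described below); the column types and the use of SQLite are assumptions, and the date columns are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE grouper_dictionary (id INTEGER PRIMARY KEY, value TEXT UNIQUE);
    CREATE TABLE grouper_data_field (id INTEGER PRIMARY KEY, system_name TEXT);
    CREATE TABLE grouper_data_row (id INTEGER PRIMARY KEY, system_name TEXT);
    CREATE TABLE grouper_data_member_row (
        id INTEGER PRIMARY KEY, member_id TEXT, grouper_data_row_id INTEGER);
    CREATE TABLE grouper_data_member_row_field (
        id INTEGER PRIMARY KEY, grouper_member_data_row_id INTEGER,
        grouper_data_field_id INTEGER, value_id INTEGER);

    INSERT INTO grouper_dictionary VALUES (654, 'math');
    INSERT INTO grouper_data_field VALUES (567, 'org');
    INSERT INTO grouper_data_row VALUES (123, 'affiliation');
    INSERT INTO grouper_data_member_row VALUES (321, '012', 123);
    INSERT INTO grouper_data_member_row_field VALUES (637, 321, 567, 654);
""")

# Resolve row name, field name, and dictionary value for one member
query = """
    SELECT r.system_name, f.system_name, d.value
    FROM grouper_data_member_row mr
    JOIN grouper_data_row r ON r.id = mr.grouper_data_row_id
    JOIN grouper_data_member_row_field mrf ON mrf.grouper_member_data_row_id = mr.id
    JOIN grouper_data_field f ON f.id = mrf.grouper_data_field_id
    JOIN grouper_dictionary d ON d.id = mrf.value_id
    WHERE mr.member_id = ?
"""
result = conn.execute(query, ("012",)).fetchall()
```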

grouper_dictionary

Keep data field values here to reduce data redundancy

  • id (789)
    • value (a@b.c)
  • id (654)
    • value (math)
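
A get-or-create helper is one way to keep each distinct value in the dictionary only once. This is a sketch: `value_id_for` is a hypothetical name, the unique constraint on `value` is an assumption, and `INSERT OR IGNORE` is SQLite syntax that would need an equivalent in other databases.

```python
import sqlite3


def value_id_for(conn, value):
    """Return the dictionary id for a value, inserting it only if it is new."""
    conn.execute("INSERT OR IGNORE INTO grouper_dictionary (value) VALUES (?)", (value,))
    row = conn.execute(
        "SELECT id FROM grouper_dictionary WHERE value = ?", (value,)
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grouper_dictionary (id INTEGER PRIMARY KEY, value TEXT UNIQUE)")

# Two users in the math org share one dictionary row
first = value_id_for(conn, "math")
second = value_id_for(conn, "math")
other = value_id_for(conn, "a@b.c")
```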

grouper_data_field_sec_group

List of security groups for data field columns and rows

Could be who is allowed to see a column, who is allowed to see a rowGroup, or who is in the row

  • id
    • group_id_index


grouper_data_field_sec_group_mem_cache

Cache these memberships so lookups are fast. Also cache them in memory for long-running processes.

  • id
    • sec_group_id
    • mem_id_index
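
For the in-memory part, a long-running process might hold a copy of this table and refresh it periodically. The class below is a sketch only: the class name, the time-to-live refresh policy, and the loader callable are assumptions, not an existing Grouper API.

```python
import time


class SecGroupMembershipCache:
    """Hypothetical in-memory copy of grouper_data_field_sec_group_mem_cache."""

    def __init__(self, load_rows, ttl_seconds=60.0, clock=time.monotonic):
        self._load_rows = load_rows  # callable returning (sec_group_id, mem_id_index) pairs
        self._ttl = ttl_seconds
        self._clock = clock
        self._members = None  # dict: sec_group_id -> set of mem_id_index
        self._loaded_at = 0.0

    def _reload(self):
        members = {}
        for sec_group_id, mem_id_index in self._load_rows():
            members.setdefault(sec_group_id, set()).add(mem_id_index)
        self._members = members
        self._loaded_at = self._clock()

    def is_member(self, sec_group_id, mem_id_index):
        # Reload on first use, or once the cached copy is older than the TTL
        if self._members is None or self._clock() - self._loaded_at > self._ttl:
            self._reload()
        return mem_id_index in self._members.get(sec_group_id, set())


load_calls = []


def load_rows():
    """Stand-in for a database query; counts how often it is called."""
    load_calls.append(1)
    return [(1, 1001), (1, 1002), (2, 1001)]


fake_now = [0.0]  # controllable clock so the example does not need to sleep
cache = SecGroupMembershipCache(load_rows, ttl_seconds=60.0, clock=lambda: fake_now[0])
```

Real-time membership events could replace the time-based refresh; the TTL is used here only to keep the sketch short.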

grouper_data_field_sec_data_field

  • id
    • grouper_data_field_id (567)
    • grouper_data_field_sec_group_id

grouper_data_field_row_pop_group

  • id
    • data_field_sec_group_id_of_row
    • data_field_sec_group_id_can_see

