Grouper Call of April 26, 2023

Attending

Chris Hyzer, Penn, Chair
Shilen Patel, Duke
Vivek Sachdiva, independent
Chris Hubing, Internet2
Carey Matt Black, Purdue
Drew Aschenbrener, Internet2
Gabor Eszes, University of Virginia
Kellen Murphy, University of Virginia
Ben Rappleyea, Illinois State U
Gail Lift, University of Michigan
Emily Eisbruch, Internet2

DISCUSSION

Administrivia

Mark your calendar: Internet2 TechEx is Sept 18-22, 2023 in Minneapolis

Update from JJ at Unicon re authentication doc in v5:

Need to update for new version of pac4j
Java 17
Will do 2 levels of doc.
Bare minimum to get going and then more complete doc
Some advanced topics on linked or chained auth
For complex use cases

DDL changes (Chris)

https://spaces.at.internet2.edu/x/NQbvCQ
Problem is there are 2 types of DDL changes
Non-substantial DDL changes (adding a view or index)
and
Substantial DDL changes (e.g., adding an internal ID to a table that must be populated)
can’t have old Grouper versions updating the database
not forward compatible
DDL update and then an upgrade task, runs every 30 min, does checks
Problem: you start up daemon and other daemons might be running
It’s not as timely as it should be
We have a table in database to store each DDL change
We have DDL table now
Add a table for Grouper DDL change?
Recording whether the database has run that change yet and whether it’s substantial or not
When Grouper starts, it checks the table, and decides best course
Need upgrade tasks with types, so have DDL upgrade task and synchronous upgrade task
Need logging
Hopefully upgrades will be easier with this strategy
Hard with Java to know
Still need to shut down old ones, new ones will come up at same time and collaborate
AI Chris will create wiki for this for v5
Medium priority
ABAC is top priority
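
The DDL change table idea above can be sketched as follows. This is only an illustration of the startup check, not the Grouper design: the table name `ddl_change_log`, its columns, the `DdlChange` class, and the four plan strings are all assumptions made for the example, and SQLite stands in for the real database. The "refuse to start" branch is one reading of "can’t have old Grouper versions updating the database".

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class DdlChange:
    name: str          # hypothetical change identifier
    substantial: bool  # True if older versions must not write afterwards


def create_registry(conn):
    # Hypothetical table: one row per DDL change the database has run.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ddl_change_log ("
        " name TEXT PRIMARY KEY,"
        " substantial INTEGER NOT NULL,"
        " applied_on TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )


def record_applied(conn, change):
    conn.execute(
        "INSERT INTO ddl_change_log (name, substantial) VALUES (?, ?)",
        (change.name, int(change.substantial)),
    )


def startup_plan(conn, shipped):
    """Decide what a starting node should do, given the changes it ships with."""
    applied = dict(conn.execute("SELECT name, substantial FROM ddl_change_log"))
    known = {c.name for c in shipped}
    # The database has a substantial change this (older) version does not know.
    if any(sub and name not in known for name, sub in applied.items()):
        return "refuse_to_start"
    pending = [c for c in shipped if c.name not in applied]
    if not pending:
        return "start"
    if any(c.substantial for c in pending):
        return "upgrade_synchronously"
    return "start_and_upgrade_in_background"
```

In this sketch a node that only has views or indexes left to add starts normally and upgrades in the background, while a pending substantial change forces a synchronous upgrade first.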

Recent Work

Vivek

Working on “start withs” https://spaces.at.internet2.edu/pages/viewpage.action?pageId=219914238
Almost done
Working on SQL “start with” with Chris
Then externalize text
Provisioning screens and Start With screens
Labels will be revealed and be sure they are easy to digest, with examples
Worked on JIRA 4720 on Box provisioner
GRP-4720 external system has two 'box' in drop down, do we need to remove old box external system? are they the same?

Worked on GRP-4721
in box provisioning, if not making changes to entities, then "name" attribute should not be required
Worked on GRP-4722
translation continue expression should just assign null, not fail
Chris: hope to focus on Grouper v5 when this work is done

Shilen

Did minor updates to LDAP start with
User base DN wasn’t being required when selecting from the target
Issue of assuming provisioning subject ID
Did update to diagnostics
Issue with provisioning subject IDs, now resolved
Updated Grouper membership view to not depend on other views
Hard to change views if other views depend on them
Now there is an upgrade step that can look at grants and try to do a replace
That view uses membership all view, there have been performance issues
Going forward, important to try to make views that don’t use other views
Make them complex but faster
In v5, stop allowing base DN to be configured as part of LDAP URL
Do not allow users to put base DN in UI
Change external system and track down everything that uses that
Base DN issue impacts loader job, provisioner, subject source
In external system, good to have usage button
Chris added the external system usage to the Grouper roadmap
https://spaces.at.internet2.edu/display/Grouper/Grouper+Product+Roadmap
Shilen: When you go to the UI and edit an external system, it’s not calling validation; Shilen will fix that in the UI
Another provisioning issue: If LDAP DN is the search attribute, then “search all” groups or entities was not working correctly
Decided to exclude LDAP DN from “search all”
Assume another search attribute
Shilen: there was request around USDU and max unresolvables, Shilen will look at this, Chris assigned a JIRA

Chris

Working on v5
Object model, tables, beans, logic
Next generation DDL structure
https://spaces.at.internet2.edu/x/8pwbDQ
8-byte integer ID for certain objects; it isn’t exposed
Not available for web service.
It’s for foreign keys
Add to grouper groups and grouper fields
Table Structure
First table is SQL cache group table, has group internal ID
There will be a lot of querying
Need fast updates
Reduce number of foreign keys to make it quicker
Table will hold unique tuples
One row represents membership list
Has enabled on and disabled on date
Membership table is lightweight
Has flattened add timestamp
Important info is when you are added to the group
Multiple inserts in one call, hope to populate quickly
Membership PIT table is also flattened
Rows for every path you are added to a group
Link to field and group, link to member
Does not overlap with current point in time
Don’t need field for existing members
Note:
Only existing groups can be cached
Once they are deleted and no internal ID, cache goes away
Policies with ABAC : does not make sense for deleted groups and members in cached PIT
Naming issue: Point in time is not point in time
It’s a cached point in time
This is retention record for PIT
It is a cache of historical memberships
Need to document this
Call it Cached History, not point in time
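
A rough sketch of the table structure described above, using SQLite for illustration. All table and column names here (`cache_group`, `cache_mship`, `cache_mship_history`, and so on) are assumptions for the example and may not match the actual v5 DDL. No foreign key constraints are declared, following the note about keeping updates fast, and history rows are written only when a membership ends, so they do not overlap with current memberships.

```python
import sqlite3

SCHEMA = """
CREATE TABLE cache_group (
    internal_id       INTEGER PRIMARY KEY,  -- integer id, internal use only
    group_internal_id INTEGER NOT NULL,
    field_internal_id INTEGER NOT NULL,
    enabled_on        INTEGER,
    disabled_on       INTEGER,
    UNIQUE (group_internal_id, field_internal_id)  -- one row per membership list
);
CREATE TABLE cache_mship (
    cache_group_internal_id INTEGER NOT NULL,
    member_internal_id      INTEGER NOT NULL,
    flattened_add_timestamp INTEGER NOT NULL,
    PRIMARY KEY (cache_group_internal_id, member_internal_id)
);
CREATE TABLE cache_mship_history (
    cache_group_internal_id INTEGER NOT NULL,
    member_internal_id      INTEGER NOT NULL,
    start_time              INTEGER NOT NULL,
    end_time                INTEGER NOT NULL
);
"""


def build(conn):
    conn.executescript(SCHEMA)


def add_members(conn, cache_group_id, member_ids, now):
    # Batched insert; an existing row keeps its original add timestamp.
    conn.executemany(
        "INSERT OR IGNORE INTO cache_mship VALUES (?, ?, ?)",
        [(cache_group_id, m, now) for m in member_ids],
    )


def remove_member(conn, cache_group_id, member_id, now):
    # Move the current row into history so the two tables never overlap.
    key = (cache_group_id, member_id)
    row = conn.execute(
        "SELECT flattened_add_timestamp FROM cache_mship"
        " WHERE cache_group_internal_id = ? AND member_internal_id = ?",
        key,
    ).fetchone()
    if row is None:
        return
    conn.execute(
        "DELETE FROM cache_mship"
        " WHERE cache_group_internal_id = ? AND member_internal_id = ?",
        key,
    )
    conn.execute(
        "INSERT INTO cache_mship_history VALUES (?, ?, ?, ?)",
        (cache_group_id, member_id, row[0], now),
    )
```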

If group that’s a member of another group gets deleted
History gets deleted, but effective groups not deleted are in PIT tables
Will hold all the sources, will hold groups in groups
All effective memberships, not immediate membership
When you make a group cacheable you need to get point in time
Do efficient queries on PIT tables
If something goes thru changelogtemp to changelog, we know the date it was removed. There is an end time on the record
When you make a group cacheable you must get all PIT records
Make sure everything is right
Need to think of other edge cases
Attribute is used to make a group cacheable, thanks to Matt’s suggestion
Don’t need full history instantly available; can wait til next full sync
Tables get updated from multiple places
Including changelog temp to changelog
Should there be a separate changelog consumer?
Changelogtemp to changelog has all the necessary info
Threads created from changelogtemp to changelog to handle bigger operations
Yes, have threads and they join at the end
If adding a big group to another cacheable group, it’s going to the thread
Full sync would go through all cacheable groups, all data in the tables, remove duplicates, see the sizes of groups, make sure membership are correct, make sure point in time is correct
Question: does the full sync daemon make changes, or just send messages about things that need to be looked at?
Full sync would not block
Full sync should not find a lot of things to do, things should be correct
Another solution: Make change, check it, keep looping to reduce work changelogtemp to changelog has
Fewer duplicates if only changelogtemp is making changes
If full sync makes changes, pauses, checks, that should work
Matt: concerned about the UI
Don’t want surprising user experience
Bad if UI hangs or behaves in an unexpected way
Good to push backend work to backend, makes it easier to scale
Going back to how to make something cacheable
A couple of ways
One way: a group that Grouper uses a lot, like sysadmin group
Grouper makes it cacheable
Another way: if you use it in any ABAC script Grouper marks it as cacheable
If you manually add that attribute
Grouper needs to know how the attribute was added
If you remove the group from all ABAC scripts you might want it to still be cacheable
Need metadata?
Suggestion that once cacheable, always cacheable
Could have web service calls slow down because Grouper decided to remove cacheable, and you would not know why
Good discussion
AI Chris Hyzer - update the Membership cache tables wiki https://spaces.at.internet2.edu/x/xIlQDw
Takeaways:
- Look for cacheable groups
- Change scripted groups to use SQL
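
The discussion above about how a group becomes cacheable, and whether it should stay cacheable, could be modeled roughly like this. It is a sketch only: the `CacheableState` class and the reason labels (`"system"`, `"abac_script"`, `"manual"`) are hypothetical, standing in for whatever metadata Grouper would store alongside the attribute.

```python
from dataclasses import dataclass, field


@dataclass
class CacheableState:
    # Why the group is currently marked, e.g. "system", "abac_script", "manual".
    reasons: set = field(default_factory=set)
    # Remembers whether the group was ever marked cacheable.
    ever_cacheable: bool = False

    def mark(self, reason):
        self.reasons.add(reason)
        self.ever_cacheable = True

    def unmark(self, reason):
        self.reasons.discard(reason)

    def is_cacheable(self, sticky=True):
        # sticky=True models "once cacheable, always cacheable";
        # sticky=False drops the cache when the last reason goes away.
        return self.ever_cacheable if sticky else bool(self.reasons)
```

With the sticky policy, removing a group from every ABAC script leaves it cacheable, which avoids the unexplained web service slowdown raised in the discussion. Recording the reasons still shows how the attribute got there.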

Chris met with community members doing provisioning and having issues w subject sources

Chad

Posted on core slack, if you want an external system not well integrated w Grouper, how to send messages on membership changes?
There’s a rule to send an email
“Working with a customer, we're trying to work out a gap in what they need. Grouper is running in AWS, the logging goes to Azure Sentinel. They want a watch on certain sensitive groups when the membership changes. There are membership rules that can send email, but that's too analog for them. A custom log message as a RuleThen type would be a good option. We could write a hook or changelog consumer, but they don't want custom programming. There are WS apis to pull the audits, but I don't see where you can set multiple groups in the GET call, despite what the wiki says. So it would mean multiple WS calls per group which doesn't scale. Is adding new RuleThen behavior the best option for this gap?”
Little scripts, GSH custom code would help
Use web hook messages?
Web services interface, HTTP messaging, several apps support this for inbound
Call a web hook
Send a message to teams
Can be done w Jenkins
Email but not email
Changelog consumer in GSH, not in Java
Rules are synchronous, this is asynchronous
Technical debt in the Java code
AI Chad - create JIRA on send messages on membership changes and Changelog consumer in GSH, not in Java
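
To illustrate the web hook idea above, here is a minimal sketch of a consumer that posts a JSON message for membership changes on watched groups. It is not Grouper code: the change record shape (`group`, `subject`, `action`), the function names, and the payload format are assumptions for the example, and a real changelog consumer would receive its records from Grouper rather than from a list.

```python
import json
import urllib.request


def build_webhook_request(url, change):
    # Hypothetical payload; a real receiver defines its own format.
    payload = {
        "group": change["group"],
        "subject": change["subject"],
        "action": change["action"],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post(request):
    # Default sender: a plain HTTP POST with a timeout.
    with urllib.request.urlopen(request, timeout=10) as response:
        response.read()


def notify(changes, watched_groups, url, send=post):
    """Send one web hook call per change on a watched group; return the count."""
    sent = 0
    for change in changes:
        if change["group"] in watched_groups:
            send(build_webhook_request(url, change))
            sent += 1
    return sent
```

Passing `send` as a parameter keeps the filtering logic testable without a network, and the same loop could write a log line instead of calling a web hook.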

Issue Roundup

Jiras in past two weeks

GRP-4696
Loader jobs summary page shows count -1 if there are any subject problems

GRP-4695
Visualization "Unable to retrieve..." errors shouldn't dump a whole stacktrace

Grouper wiki updates in past two weeks

Grouper Emails in past two weeks

none

Next Grouper Call: Wed May 10, 2023
