Grouper Call of April 26, 2023

Attending 

  • Chris Hyzer, Penn, Chair
  • Shilen Patel, Duke
  • Vivek Sachdiva, independent
  • Chris Hubing, Internet2
  • Carey Matt Black, Purdue
  • Drew Aschenbrener, Internet2
  • Gabor Eszes, University of Virginia
  • Kellen Murphy, University of Virginia
  • Ben Rappleyea, Illinois State U
  • Gail Lift, University of Michigan
  • Emily Eisbruch, Internet2

DISCUSSION

Administrivia

Mark your calendar: Internet2 TechEx is Sept 18-22, 2023 in Minneapolis

Update from JJ at Unicon re authentication doc in v5: 

  • Need to update for the new version of pac4j
  • Java 17
  • Will do 2 levels of doc: a bare minimum to get going, then a more complete doc
  • Some advanced topics on linked or chained auth, for complex use cases


DDL changes (Chris)

  • https://spaces.at.internet2.edu/x/NQbvCQ
  • Problem: there are 2 types of DDL changes
  • Non-substantial (e.g., adding a view or index)
    and
  • Substantial DDL changes (e.g., adding an internal ID to a table that must be populated)
  •    Can’t have old Grouper versions updating the database
  •    Not forward compatible
  •    DDL update and then an upgrade task, which runs every 30 min and does checks
  •    Problem: when you start up a daemon, other daemons might be running
  •    It’s not as timely as it should be
  •    We have a table in the database to store each DDL change
  •    We have the DDL table now
  •    Add a table for Grouper DDL changes?
  •        It would record whether the database has run each change yet and whether it’s substantial or not
  •        When Grouper starts, it checks the table and decides the best course
  • Need upgrade tasks with types, so have a DDL upgrade task and a synchronous upgrade task
  • Need logging
  • Hopefully upgrades will be easier with this strategy
  • Hard with Java to know
  • Still need to shut down old ones; new ones will come up at the same time and collaborate
  • AI Chris will create wiki for this for v5
  • Medium priority
  • ABAC is top priority
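
The startup check discussed above could look roughly like the sketch below. It is only an illustration: the table name `ddl_change`, its columns, and the action names are hypothetical and not the actual Grouper schema, and mapping substantial changes to the synchronous upgrade task is an assumption based on the notes.

```python
import sqlite3

# Hypothetical table and column names; the real Grouper schema may differ.
SCHEMA = """
CREATE TABLE ddl_change (
    change_id   TEXT PRIMARY KEY,
    substantial INTEGER NOT NULL,   -- 1 = substantial change
    applied_on  TEXT                -- NULL until the database has run it
)
"""


def pending_changes(conn, known_changes):
    """Return (change_id, substantial) pairs the database has not run yet."""
    applied = {
        row[0]
        for row in conn.execute(
            "SELECT change_id FROM ddl_change WHERE applied_on IS NOT NULL"
        )
    }
    return [(cid, sub) for cid, sub in known_changes if cid not in applied]


def startup_action(conn, known_changes):
    """Decide what to do at startup based on the recorded DDL changes."""
    pending = pending_changes(conn, known_changes)
    if not pending:
        return "start"
    # Assumption: any pending substantial change needs the synchronous task.
    if any(sub for _, sub in pending):
        return "synchronous_upgrade"
    return "ddl_upgrade"


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    known = [("add_view_x", False), ("add_internal_id", True)]
    print(startup_action(conn, known))
```

Keeping the decision in one function would also give a single place to add the logging the group asked for.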


Recent Work


Vivek



Shilen

  • Did minor updates to LDAP starts with
  •    User base DN wasn’t being required when selecting from the target
  •    Issue of assuming provisioning subject ID


  • Did update to diagnostics
  •    Issue with provisioning subject IDs, now resolved
  • Updated Grouper membership view to not depend on other views
  • It is hard to change views if other views depend on them
  •    There is now an upgrade step that can look at grants and try to do a replace
  • That view uses the membership all view, and there have been performance issues
  • Going forward, important to try to make views that don’t use other views
  • Make them complex but faster
  • In v5, stop allowing base DN to be configured as part of LDAP URL
  • Do not allow users to put base DN in UI
  • Change external system and track down everything that uses that
  • Base DN issue impacts loader job, provisioner, subject source
  • In external system, good to have a usage button
  • Chris added the external system usage to the Grouper roadmap
  • https://spaces.at.internet2.edu/display/Grouper/Grouper+Product+Roadmap
  • Shilen: When you edit an external system in the UI, it’s not calling validation; Shilen will fix that in the UI
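
As an illustration of the base DN restriction planned for v5, a validation step could detect a base DN embedded in the LDAP URL along these lines. This is a minimal sketch using Python's standard URL parser; the function names and the example host are hypothetical, and Grouper's actual validation may work differently.

```python
from urllib.parse import urlparse


def base_dn_in_url(ldap_url):
    """Return the base DN found in the URL path, or None if there is none."""
    path = urlparse(ldap_url).path.lstrip("/")
    return path or None


def validate_external_system_url(ldap_url):
    """Reject LDAP URLs that carry a base DN (hypothetical validation rule)."""
    dn = base_dn_in_url(ldap_url)
    if dn is not None:
        raise ValueError(
            f"base DN {dn!r} must be configured separately, not in the URL"
        )
    return ldap_url


if __name__ == "__main__":
    print(base_dn_in_url("ldaps://ldap.example.edu:636"))
    print(base_dn_in_url("ldap://ldap.example.edu:389/dc=example,dc=edu"))
```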

  • Another provisioning issue: If LDAP DN is a search attribute, then “search all” for groups or entities was not working correctly
  • Decided to exclude LDAP DN from “search all”
  • Assume another search attribute
  • Shilen: there was a request around USDU and max unresolvables; Shilen will look at this, and Chris assigned a JIRA



Chris

  • Working on v5
  • Object model, tables, beans, logic
  • Next generation DDL structure  
  • https://spaces.at.internet2.edu/x/8pwbDQ
  • 8-byte integer ID for certain objects; it isn’t exposed
  • Not available for web services
  • It’s for foreign keys
  • Add to grouper groups and grouper fields
  • Table Structure
  • First table is SQL cache group table, has group internal ID
  • There will be a lot of querying
  • Need fast updates
  • Reduce number of foreign keys to make it quicker
  • Table will hold unique tuples
  • One row represents membership list
  • Has enabled on and disabled on date
  • Membership table is lightweight
  • Has flattened add timestamp
  • Important info is when you are added to the group
  • Multiple inserts in one call, hope to populate quickly
  • Membership PIT table is also flattened
  • Rows for every path you are added to a group
  • Link to field and group, link to member
  • Does not overlap with current point in time
  • Don’t need field for existing members
  • Note: 
  • Only existing groups can be cached
  • Once they are deleted and no internal ID, cache goes away
  • Policies with ABAC: does not make sense for deleted groups and members in cached PIT
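
To make the table structure described above more concrete, here is a rough sketch using SQLite. All table and column names are illustrative guesses, not the actual v5 schema. It shows one row per membership list, a lightweight membership table with a flattened add timestamp, no declared foreign keys, and multiple inserts in one call.

```python
import sqlite3

# Illustrative names only; the real v5 tables may be named and shaped differently.
SCHEMA = """
CREATE TABLE sql_cache_group (
    internal_id       INTEGER PRIMARY KEY,  -- SQLite integers hold up to 8 bytes
    group_internal_id INTEGER NOT NULL,
    field_internal_id INTEGER NOT NULL,
    enabled_on        INTEGER,
    disabled_on       INTEGER,
    UNIQUE (group_internal_id, field_internal_id)  -- one row per membership list
);
CREATE TABLE sql_cache_mship (
    sql_cache_group_internal_id INTEGER NOT NULL,
    member_internal_id          INTEGER NOT NULL,
    flattened_add_timestamp     INTEGER NOT NULL,  -- when the member was added
    PRIMARY KEY (sql_cache_group_internal_id, member_internal_id)
);
"""


def add_members(conn, cache_group_id, member_ids, timestamp):
    """Insert many flattened memberships in one call; return the list size."""
    rows = [(cache_group_id, member_id, timestamp) for member_id in member_ids]
    # OR IGNORE keeps the original add timestamp for existing members.
    conn.executemany(
        "INSERT OR IGNORE INTO sql_cache_mship VALUES (?, ?, ?)", rows
    )
    count = conn.execute(
        "SELECT COUNT(*) FROM sql_cache_mship"
        " WHERE sql_cache_group_internal_id = ?",
        (cache_group_id,),
    ).fetchone()[0]
    return count


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO sql_cache_group VALUES (1, 100, 7, 0, NULL)")
    print(add_members(conn, 1, [10, 11, 12], 1682500000))
```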

  • Naming issue: Point in time is not point in time  
  • It’s a cached point in time 
  • This is retention record for PIT
  • It is a cache of historical memberships 
  • Need to document this
  • Call it Cached History, not point in time


  • If group that’s a member of another group gets deleted
  • History gets deleted, but effective groups not deleted are in PIT tables 
  • Will hold all the sources, will hold groups in groups
  • All effective memberships, not immediate membership
  • When you make a group cacheable you need to get point in time 
  • Do efficient queries on PIT tables
  • If something goes through changelogtemp to changelog, we know the date it was removed; there is an end time on the record
  • When you make a group cacheable you must get all PIT records
  • Make sure everything is right
  • Need to think of other edge cases

  • Attribute is used to make a group cacheable, thanks to Matt’s suggestion
  • Don’t need full history instantly available; can wait until the next full sync
  • Tables get updated from multiple places
  •     Including changelogtemp to changelog
  •    Should there be a separate changelog consumer?
  •    Changelogtemp to changelog has all the necessary info
  •    Threads created from changelogtemp to changelog to handle bigger operations
  •    Yes, have threads and they join at the end
  •    If adding a big group to another cacheable group, it’s going to the thread
  • Full sync would go through all cacheable groups and all data in the tables, remove duplicates, see the sizes of groups, make sure memberships are correct, and make sure point in time is correct
  • Open question: does the full sync daemon make changes, or does it just send messages about things that need to be looked at?
  • Full sync would not block
  • Full sync should not find a lot of things to do, things should be correct
  • Another solution: make a change, check it, and keep looping, to reduce the work that changelogtemp to changelog has
  • Fewer duplicates if only changelogtemp is making changes
  • If full sync makes changes, pauses, checks, that should work
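
One of the options discussed above is a full sync that only reports discrepancies as messages instead of changing the cache itself. A minimal sketch of that comparison is below. The data shapes (dicts of sets) and the message tuples are assumptions made for illustration, not Grouper structures.

```python
def full_sync_messages(expected, cached):
    """Compare expected memberships with cached rows for each cacheable group.

    expected and cached map a group name to a set of member IDs.
    Returns (action, group, member) messages, ordered by group, rather than
    modifying the cache directly.
    """
    messages = []
    for group in sorted(set(expected) | set(cached)):
        want = expected.get(group, set())
        have = cached.get(group, set())
        messages += [("add", group, m) for m in sorted(want - have)]
        messages += [("remove", group, m) for m in sorted(have - want)]
    return messages


if __name__ == "__main__":
    expected = {"app:admins": {"alice", "bob"}}
    cached = {"app:admins": {"bob", "carol"}}
    for message in full_sync_messages(expected, cached):
        print(message)
```

If the cache is correct, the function returns an empty list, which matches the expectation that full sync should not find much to do.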

  • Matt: concerned about the UI
  • Don’t want surprising user experience
  • Bad if UI hangs or behaves in an unexpected way
  • Good to push backend work to backend, makes it easier to scale


  • Going back to how to make something cacheable
  • A couple of ways
  • One way: a group that Grouper uses a lot, like sysadmin group
  • Grouper makes it cacheable
  • Another way: if you use it in any ABAC script, Grouper marks it as cacheable
  • Another way: you manually add that attribute
  • Grouper needs to know how the attribute was added
  • If you remove the group from all ABAC scripts, you might want it to still be cacheable
  • Need metadata?
  • Suggestion that once cacheable, always cacheable
  • Could have a web services call slow down because Grouper decided to remove cacheable, and you would not know why
  • Good discussion
  • AI Chris Hyzer -  update the Membership cache tables wiki https://spaces.at.internet2.edu/x/xIlQDw
  • Takeaways:
    • Look for cacheable groups
    • Change scripted groups to use SQL 
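
The metadata question above (how the cacheable attribute was added, and whether "once cacheable, always cacheable" applies) can be sketched as a small data structure. Everything here is hypothetical: the class, the reason names, and the `sticky` flag are only meant to contrast the two policies.

```python
from dataclasses import dataclass, field


@dataclass
class CacheableState:
    """Tracks why a group is cacheable (hypothetical metadata, not Grouper's model)."""

    reasons: set = field(default_factory=set)  # e.g. "system", "abac", "manual"
    ever_cacheable: bool = False

    def add_reason(self, reason):
        self.reasons.add(reason)
        self.ever_cacheable = True

    def remove_reason(self, reason):
        self.reasons.discard(reason)

    def is_cacheable(self, sticky=True):
        # sticky=True models "once cacheable, always cacheable";
        # sticky=False drops cacheability when the last reason is removed.
        return self.ever_cacheable if sticky else bool(self.reasons)


if __name__ == "__main__":
    state = CacheableState()
    state.add_reason("abac")
    state.remove_reason("abac")
    print(state.is_cacheable(sticky=True), state.is_cacheable(sticky=False))
```

Recording the reasons separately means a manually added attribute would survive the removal of the group from all ABAC scripts under either policy.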


Chris met with community members doing provisioning and having issues with subject sources

Chad

  • Posted on core Slack: for an external system that is not well integrated with Grouper, how do you send messages on membership changes?
  • There’s a rule to send an email
  • “Working with a customer, we're trying to work out a gap in what they need. Grouper is running in AWS, the logging goes to Azure Sentinel. They want a watch on certain sensitive groups when the membership changes. There are membership rules that can send email, but that's too analog for them. A custom log message as a RuleThen type would be a good option. We could write a hook or changelog consumer, but they don't want custom programming. There are WS apis to pull the audits, but I don't see where you can set multiple groups in the GET call, despite what the wiki says. So it would mean multiple WS calls per group which doesn't scale. Is adding new RuleThen behavior the best option for this gap?”
  • Little scripts, GSH custom code would help
  • Use web hook messages?
  • Web services interface, HTTP messaging; several apps support this for inbound
  • Call a web hook
  • Send a message to Teams
  • Can be done with Jenkins
  • Email but not email
  • Changelog consumer in GSH, not in Java
  • Rules are synchronous, this is asynchronous
  • Technical debt in the Java code
  • AI Chad - create JIRA on sending messages on membership changes and on a changelog consumer in GSH, not in Java
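
For the web hook idea above, the outbound message could be as simple as an HTTP POST with a JSON body. The sketch below uses only the Python standard library and does not use any Grouper API. The payload fields, the URL, and the function names are hypothetical, and the receiving system would define the real format.

```python
import json
import urllib.request


def build_webhook_request(url, group_name, subject_id, action):
    """Build (but do not send) an HTTP POST describing a membership change."""
    payload = {"group": group_name, "subject": subject_id, "action": action}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=10):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    request = build_webhook_request(
        "https://hooks.example.org/grouper",  # placeholder URL
        "app:sensitive:admins",
        "jsmith",
        "add",
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Because this would run from an asynchronous changelog consumer rather than a synchronous rule, a slow or failing web hook would not block the membership change itself.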


Issue Roundup 


Jiras in past two weeks



GRP-4696
Loader jobs summary page shows count -1 if there are any subject problems


Grouper wiki updates in past two weeks


Grouper Emails in past two weeks

  none


Next Grouper Call:  Wed May 10, 2023


