Background

As part of a university Identity and Access Management project, Penn State was tasked with implementing a new Central Person Registry (CPR).  The CPR is an intelligent registry for managing person information and represents a fundamental change to business and systems.  Creating and managing person information in one central place will position Penn State to support the growth of cloud services and the growing number of systems for HR, Finance, Outreach and World Campus, Student Systems, College of Medicine and so forth.  A single individual can be represented in many systems but should only need to change personal information in one place.

Penn State is considering offering this CPR as open source.  The responses below are based solely on an institutional perspective on management and governance.  Penn State is interested in licensing and supporting this registry for the OSIdM4HE higher education community.  The level of support and type of licensing are still under consideration and up for discussion.

Response to Questions

IAM Registry questions to evaluate features and functionality against standard business requirements.

Category

Description or Question for solution provider

Response

Link(s) to Documentation

General architecture

Describe how ID match capability is provided by the registry solution. For example, is it (a) an integral part of the solution as provided or (b) must it be integrated with an external ID match engine or (c) can it be provided in some other way?

The Penn State registry solution for matching has two parts: an external engine that generates match codes and an algorithm that is part of the registry, so it is both (a) and (b).  It is flexible enough to accommodate other solutions.  The match codes generated by the appliance take into account variations in name and address, so Bill Smith, William Smith and Billy Smith would all produce the same match code.  When the matching process runs, an exact match is attempted first using the Penn State Identifier Number (PSU ID Number), the Social Security Number or the userid.  If the exact match fails, a near match is done using the match codes for name, date of birth, and address.  The result is a match score between 1 and 550.  For Penn State, a match is a score of at least 330.  There are two match algorithms, one domestic and one international.  In addition to identity matching, the registry is responsible for cleansing the data when possible.  To support this, an external product was purchased to validate addresses against the USPS.  Products exist to validate addresses for other countries as well.  With the near match logic, the CPR will be able to decrease the number of duplicate records.  All of this hinges on consistent data collection, so for the CPR we are going to require new person records to have a name, an address and a partial or full date of birth.  If any part of that information is missing, the record will not be included in matching, but it could be stored in the registry as an orphan.
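The two-stage flow described above can be illustrated with a minimal Java sketch.  It is not the CPR implementation: the class, field and method names are hypothetical, and the per-field weights (250/150/150) are assumptions chosen only so that they sum to the documented maximum of 550.  Only the order of the steps, the 330 threshold and the exclusion of incomplete records come from the response.

```java
import java.util.List;

/** Hypothetical sketch: exact identifier match first, then a scored near match. */
class MatchSketch {
    /** From the response: a near-match score of at least 330 counts as a match. */
    static final int MATCH_THRESHOLD = 330;

    /** Illustrative record; these field names are not the CPR schema. */
    static class Person {
        final String psuId, ssn, userid;             // exact-match identifiers
        final String nameCode, dobCode, addressCode; // match codes from the external engine

        Person(String psuId, String ssn, String userid,
               String nameCode, String dobCode, String addressCode) {
            this.psuId = psuId; this.ssn = ssn; this.userid = userid;
            this.nameCode = nameCode; this.dobCode = dobCode; this.addressCode = addressCode;
        }
    }

    /** Null never matches anything, so missing identifiers cannot produce a hit. */
    static boolean same(String a, String b) {
        return a != null && a.equals(b);
    }

    static boolean exactMatch(Person in, Person candidate) {
        return same(in.psuId, candidate.psuId)
            || same(in.ssn, candidate.ssn)
            || same(in.userid, candidate.userid);
    }

    /** Assumed weights; the real scoring rules are not given in the response. */
    static int nearScore(Person in, Person candidate) {
        int score = 0;
        if (same(in.nameCode, candidate.nameCode)) score += 250;
        if (same(in.dobCode, candidate.dobCode)) score += 150;
        if (same(in.addressCode, candidate.addressCode)) score += 150;
        return score;
    }

    /** Returns the matched record, or null when no match is found. */
    static Person findMatch(Person in, List<Person> registry) {
        for (Person candidate : registry) {
            if (exactMatch(in, candidate)) return candidate;
        }
        // Records missing name, date of birth or address are excluded from near matching.
        if (in.nameCode == null || in.dobCode == null || in.addressCode == null) return null;
        Person best = null;
        int bestScore = 0;
        for (Person candidate : registry) {
            int score = nearScore(in, candidate);
            if (score >= MATCH_THRESHOLD && score > bestScore) {
                best = candidate;
                bestScore = score;
            }
        }
        return best;
    }
}
```

Under these assumed weights, an incoming record that agrees with a stored record on name and date of birth but not on address scores 400 and is treated as a match, while agreement on name alone (250) is not.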

Matching Criteria - Standard
Matching Criteria - International 

 

Describe how groups management (for use with authZ controls and other purposes) is provided. For example, is it (a) handled internally by the solution or (b) integrated with an external group management engine such as Grouper or (c) provided in some other way?

The PSU Central Person Registry has integrated Grouper to control access to the registry.  All Systems of Record (Registration Authorities) are represented as groups.  The Registration Authority Agents are assigned roles with permissions.

Within the person registry, Grouper is utilized to control authorization to the web services and the data they control.  Registration authorities are represented as a group, which is then assigned a role.  The role is assigned permissions like execute service or update data.  This enables us to remove the authorization from outside the registry.
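The group, role and permission chain described above can be pictured with a small in-memory Java sketch.  It deliberately does not call the Grouper API; the class name, method names and permission strings are hypothetical and only illustrate how a registration authority's group resolves to a role and then to permissions such as executing a service or updating data.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for the group -> role -> permission lookup; not the Grouper API. */
class AuthzSketch {
    private final Map<String, String> roleByGroup = new HashMap<String, String>();
    private final Map<String, Set<String>> permissionsByRole = new HashMap<String, Set<String>>();

    /** A registration authority is represented as a group, which is assigned one role. */
    void assignRole(String group, String role) {
        roleByGroup.put(group, role);
    }

    /** A role carries permissions such as "execute service" or "update data". */
    void grant(String role, String... permissions) {
        Set<String> granted = permissionsByRole.get(role);
        if (granted == null) {
            granted = new HashSet<String>();
            permissionsByRole.put(role, granted);
        }
        granted.addAll(Arrays.asList(permissions));
    }

    /** Deny by default: unknown groups, roles or permissions are not allowed. */
    boolean isAllowed(String group, String permission) {
        String role = roleByGroup.get(group);
        Set<String> granted = (role == null) ? null : permissionsByRole.get(role);
        return granted != null && granted.contains(permission);
    }
}
```

A web service would call something like `isAllowed("ra:hr", "execute:AddPerson")` before doing any work, which keeps the authorization decision in one place rather than in each service.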

 

Data model

Describe how the registry solution supports an extensible set of attributes about (a) persons, (b) applications or other external resources, and (c) other, arbitrary entities?

The data model is flexible to support additional person attributes by the addition of new database tables and the establishment of the linkages between the person entities and their new attributes.   Additionally, attributes such as name and address have types.  Types can be added to support specific requirements by various systems of record.

The  current design for the registry is scoped to people.  Entities will be supported in the future either in the existing CPR or in a separate registry appropriately linked to the person registry.

 

AuthZ support

Describe how the registry data model supports defining arbitrary user roles in support of authZ functions.

Roles are an integral part of any Access Management solution.  The CPR will be used to provide information in the construction of roles, however the roles themselves will not live in the registry, as they will reside in an access management solution such as Grouper.

 

Features

Describe how the registry solution supports audit logging of sensitive transactions, including support for the recording of historical changes made to sensitive data. Describe how this log includes the requester and authorizer identities, and transaction timestamps.

The registry has various levels of auditing, the first of which is database logging.  All transactions are logged to a service log table.  In addition, for each database table we maintain a history of changes to records.  Whenever a record is changed, the existing record is marked inactive and a new record is cut.  For each database table we have the following fields that determine our history:

start_date - the timestamp the record was "started".
end_date - will be NULL for active records, otherwise it will be the date the record was made inactive.
last_update_by - contains the identity (service or person) which updated the record.
last_update_on - contains the timestamp of when the record was last updated.
created_by - contains the identity (service or person) which created the record.
created_on - contains the timestamp of when the record was created.

Remember that whenever there is a change, a new record is cut in the particular database table.  We opted to go this route as opposed to having either a single audit table or multiple audit tables.  The changes for records are contained within the tables themselves.
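As an illustration of this in-table history approach, the following Java sketch models one table in memory.  The row fields mirror the columns listed above, but the class and method names are hypothetical, and updating last_update_by and last_update_on on the row being closed is an assumption; the response does not say how those columns are set when a record is made inactive.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory model of one history-keeping table (here: names). */
class HistorySketch {
    static class NameRow {
        final String personId, name;
        final Instant startDate;   // start_date
        Instant endDate;           // end_date, null while the row is active
        final String createdBy;    // created_by
        final Instant createdOn;   // created_on
        String lastUpdateBy;       // last_update_by
        Instant lastUpdateOn;      // last_update_on

        NameRow(String personId, String name, String actor, Instant now) {
            this.personId = personId; this.name = name;
            this.startDate = now; this.createdBy = actor; this.createdOn = now;
            this.lastUpdateBy = actor; this.lastUpdateOn = now;
        }
    }

    private final List<NameRow> rows = new ArrayList<NameRow>();

    /** Marks the active row inactive and cuts a new row instead of updating in place. */
    NameRow change(String personId, String newName, String actor, Instant now) {
        NameRow current = active(personId);
        if (current != null) {
            current.endDate = now;
            current.lastUpdateBy = actor; // assumption: closing a row counts as an update
            current.lastUpdateOn = now;
        }
        NameRow fresh = new NameRow(personId, newName, actor, now);
        rows.add(fresh);
        return fresh;
    }

    /** The active row is the one whose end_date is still null. */
    NameRow active(String personId) {
        for (NameRow row : rows) {
            if (row.personId.equals(personId) && row.endDate == null) return row;
        }
        return null;
    }

    /** All rows for a person, oldest first, which is the audit trail. */
    List<NameRow> history(String personId) {
        List<NameRow> result = new ArrayList<NameRow>();
        for (NameRow row : rows) {
            if (row.personId.equals(personId)) result.add(row);
        }
        return result;
    }
}
```

Because superseded rows are never deleted, the history of a person's names can be read from the same table with a filter on the person identifier, and the current value with an additional filter on a NULL end_date.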

In addition to database logging, log4j is used as part of all the Java code.

Refer to the data model at the CPR Design Wiki.

 

Describe how the registry solution supports the secure storage of security questions and answers for use in password recovery.

At this time, the data for password security questions and answers are stored in a separate database schema that is outside of our normal registry.  The data can only be accessed via the password reset application using a separate database userid and password.  The Penn State database vendor, Oracle, does provide the facility to encrypt the answers, which we have implemented.

 

 

Is there support for multiple name and address types as well as history?  If yes, please describe.

Yes, the registry supports multiple types of names, addresses, phones, and email addresses.  A type is associated with each record stored in our names, addresses, phones and email address tables.  For example, a name record can be a legal name, a preferred name or a documented name.  For a documented name, which is obtained from a legal document, we also record the document type (passport, driver's license, state identification card or military identification card).  In addition, the types are used as part of our authorization decisions.  We have designed our authorization scheme to allow RAs to assign only particular types of data, if need be.  This authorization is controlled using Grouper.  Types could be extended to represent the formatting requirements of various systems of record.

 

Identity Assurance

Are registration events captured as they occur?  Do these events automatically trigger assignment/deassignment of an IAP?

Yes, refer to our data model (IAP_Data) for more information about the data that is captured during a registration event.  The data is cumulative: once the user has met the necessary requirements for a particular IAP, it is automatically assigned.  Conversely, other events such as account misuse will trigger a downgrade of the user's IAP.

Identity Assurance Data Model

 

Is there support for real-time provisioning of identities/services?

Yes, the person registry supports the notification of provisioning requests using JMS.  When a service and/or batch process is executed that requires a service/identity to be provisioned, a JSON message is sent over JMS to the appropriate provisioner.  The results of the provisioning event are retrieved by a standalone daemon, which then updates the registry.

 

 

Describe how data is processed (batch, web services)

Data within the person registry can be processed either in batch or through web services.  The web services are SOAP-based.  The common core code is isolated in a .jar file that can be shared between the service and batch processes.

 

 

Is registry dependent on other open source or vendor products?  If yes, please provide details.

The registry is built using open source products: Apache Tomcat, Apache CXF, Java, Hibernate, Apache ActiveMQ, JBoss Drools and JAX-WS.  The only commercial product used by the registry is the matching appliance, which is isolated behind a service, so any matching solution can be dropped in with minimal changes.

 

 

Where is the business logic stored?  Is there support for delegation to maintain these rules?

Core business logic for the person registry is stored in a rules engine, Drools Expert.  To isolate applications from rule updates, the rule engine is encapsulated in a service that applications call to process rules.  The benefit of this approach is that applications do not need to be redeployed when the rules are changed.  With regard to maintenance, a future release of the person registry will utilize Drools Guvnor for rules maintenance.  Using the Drools Guvnor features and Grouper, the registry will be able to delegate the maintenance of rules.

 

 

How does the registry notify external entities of data changes?  (for example name is changed)

External entities (Service Provisioners) will register their preferences for message receipt with the person registry.  When a service or batch execution changes a data element they have subscribed to, a JMS message is sent to them with the change information.  Currently, we are doing point-to-point JMS messaging, but plan on looking at push/pull in the second quarter of 2012.
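The subscription-driven notification described above might look roughly like the following Java sketch.  It makes several assumptions: the Sender interface is a hypothetical stand-in for a JMS producer (no JMS API is called), the queue names are invented, and the JSON field names are illustrative, since the response does not document the message format.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of change notification driven by registered subscriptions. */
class ChangeNotifierSketch {
    /** Stand-in for a JMS producer; each destination is one provisioner's queue (point-to-point). */
    interface Sender {
        void send(String destination, String jsonBody);
    }

    private final Map<String, Set<String>> queuesByElement = new HashMap<String, Set<String>>();
    private final Sender sender;

    ChangeNotifierSketch(Sender sender) {
        this.sender = sender;
    }

    /** A provisioner registers interest in one data element, for example "name". */
    void subscribe(String provisionerQueue, String dataElement) {
        Set<String> queues = queuesByElement.get(dataElement);
        if (queues == null) {
            queues = new LinkedHashSet<String>();
            queuesByElement.put(dataElement, queues);
        }
        queues.add(provisionerQueue);
    }

    /** Sends one message per subscribed provisioner and returns how many were sent. */
    int publishChange(String personId, String dataElement, String newValue) {
        Set<String> queues = queuesByElement.get(dataElement);
        if (queues == null) return 0;
        String body = "{\"personId\":\"" + escape(personId)
            + "\",\"element\":\"" + escape(dataElement)
            + "\",\"newValue\":\"" + escape(newValue) + "\"}";
        for (String queue : queues) {
            sender.send(queue, body);
        }
        return queues.size();
    }

    /** Minimal escaping for backslashes and quotes only; a real JSON library would do more. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

Keeping the transport behind a small interface means the same dispatch logic could later be pointed at a different delivery style without touching the subscription bookkeeping.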

 

 

Is the code located in a public repository?

No, not at this time.  A snapshot of the code is available at a Shibboleth-protected web site.

 

 

How are changes, marketing, etc. communicated to the public? (wiki, lists, web presence)

Our registry is currently not in production.  Development is scheduled to be complete on 3/31/2012, with the registry running on production hardware.  Currently, all communication is internal to Penn State.

 

 

Is there a proper OSS license?

The open source license is still under discussion but will align with those chosen for other HE open source efforts.  Discussions with the IP Office are planned for final approval.

 

 

Is there a clear project lead?

Penn State's IAM project is sponsored by the VP of IT, Kevin Morooney.  He has created an IAM team within Information Technology Services and assigned Renee Shuey as Principal Lead of this effort.

 

 

Is there an existing project steering committee/governance?

Yes.  There is an existing IAM Governance which is sponsored by the Provost and VP of Information Technology.  Plans are to introduce an IT Leadership Council working group to provide steering and develop new policies which will be presented to the executives for approval.

 
