About Pipelines

Pipelines connect data from External Identity Sources to Person Records. Pipelines can be used to automatically enroll, update, and expire Person (Role) records linked to external sources.

Pipeline Flow

Data flows through a Pipeline in a specific sequence.

  • Authoritative data is held at the System of Record ("SoR"), which may be a formal system such as a campus ERP, or an informal system such as a spreadsheet. The System of Record exchanges data via a SQL or LDAP database, a flat file, an API, or another similar mechanism.
  • The External Identity Source is an instantiated External Identity Source Plugin. It obtains data from the System of Record and converts it to Registry's internal format.
  • The converted record is stored as an External Identity. Registry also stores an artifact known as an External Identity Source Record, which contains metadata about the System of Record data, as well as a copy of the original (unnormalized) record.
  • Person and Person Role attributes are created and updated based on the External Identity attributes.

Attribute Management

The various attributes incorporated at different parts of the Pipeline have different characteristics:

Attributes        | Description                             | Ownership | Format   | Update via      | Provisionable?
System of Record  | SoR's original data                     | SoR       | SoR      | SoR             | No
External Identity | SoR data, normalized to Registry format | SoR       | Registry | SoR             | No
Person            | Registry operational copy of SoR data   | Registry  | Registry | SoR or Registry | Yes

External Identity Roles and Person Roles

The Pipeline will automatically create a Person Role for each External Identity Role mapped from the System of Record data. If the External Identity Role is deleted, the Role Status On Delete configuration (see below) will be applied.

Handling System of Record Role Data

Some Systems of Record are not capable of reflecting multiple Roles for a single External Identity record. The behavior varies by SoR implementation, but some systems may assign a unique SoR ID to each Role, even though the Roles refer to the same individual. In this situation, each Role will look like a different External Identity to Registry, and by default will result in multiple People being created. There are a few approaches to handling this, including:

  • Preprocess the System of Record data so that by the time it reaches Registry via the External Identity Source Plugin interface it is correctly structured as a single External Identity with multiple External Identity Roles.
  • Inject a unique (per-person) Identifier into the SoR data, and use the Pipeline Identifier Match Strategy to link the records together.
  • Use the Pipeline External Match Strategy to leverage an external matching system (such as Match) to link the records together.

Details of each approach will vary according to the upstream integration, and so are beyond the scope of this documentation.
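As a rough illustration of the first approach (preprocessing), the sketch below collapses one-row-per-Role data into one record per person before it is handed to an External Identity Source Plugin. The field names (sor_id, person_key, title, department) and the output shape are assumptions made for this example and are not Registry's internal format.

```python
from collections import defaultdict

# Hypothetical SoR export: one row per Role, each with its own SoR ID.
# "person_key" is an assumed per-person value that is available upstream.
rows = [
    {"sor_id": "E100", "person_key": "p-17", "title": "Lecturer", "department": "Physics"},
    {"sor_id": "E205", "person_key": "p-17", "title": "Lab Manager", "department": "Chemistry"},
    {"sor_id": "E300", "person_key": "p-42", "title": "Registrar", "department": "Admissions"},
]


def group_roles(rows):
    """Collapse one-row-per-Role data into one record per person."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["person_key"]].append(
            {"sor_id": row["sor_id"], "title": row["title"], "department": row["department"]}
        )
    # One record per person, each carrying a list of Roles.
    return [{"person_key": key, "roles": roles} for key, roles in grouped.items()]


records = group_roles(rows)
```

Here the two rows for p-17 end up as a single record with two Roles. How such a record is then presented to Registry depends on the Plugin in use.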

Frozen Attributes

Systems of Record sometimes can't or won't change data for business or legal reasons, even when that data does not reflect how a Person is affiliated with an organization in practice. For example, an individual's official title might be Research Consultant IV, but day to day that individual is referred to as Principal Investigator. Publishing the Research Consultant IV title may not just be bothersome to the individual, but could also impede others trying to engage.

To address this class of problems, Person attributes created from an External Identity Source can be edited and then Frozen. Re-syncing the External Identity will not update the frozen attribute, unless and until the attribute is unfrozen.
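The effect of freezing on a re-sync can be pictured with the following minimal sketch. It is a conceptual model only: sync_attributes and the dictionary layout are hypothetical and do not represent Registry's implementation or data model.

```python
def sync_attributes(person_attrs, source_attrs, frozen):
    """Copy source values onto the Person, skipping frozen attribute names."""
    updated = dict(person_attrs)
    for name, value in source_attrs.items():
        if name in frozen:
            continue  # frozen: keep the locally edited value
        updated[name] = value
    return updated


# The Person's title was edited locally and then frozen.
person = {"title": "Principal Investigator", "department": "Biology"}
# The External Identity still carries the SoR's official values.
source = {"title": "Research Consultant IV", "department": "Life Sciences"}

result = sync_attributes(person, source, frozen={"title"})
```

In this model the title keeps its edited value while department follows the source. Passing an empty frozen set corresponds to unfreezing, after which the next sync overwrites the title again.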

Freezing a Person Role only freezes the values that are directly a part of that Person Role, such as title. Related attributes, such as Telephone Number, must be separately frozen.

Freezing a Person Role will prevent automatic status recalculation (which is not specific to Pipelines).

Attributes can be flagged as Frozen via the REST API, but can currently only be unfrozen via the user interface.

Configuration

Match Strategy

Before a Pipeline can do anything else, it needs to identify a Person record to operate on, or to create a new one. The mechanism by which this is specified is the Match Strategy. The following Match Strategies are supported:

  • Email Address: If the inbound record has an Email Address of the configured Type, the Pipeline will search for an existing Person with the same Email Address of the same Type. The Pipeline will perform a case-insensitive search, and only against verified Email Addresses.
  • Identifier: If the inbound record has an Identifier of the configured Type, the Pipeline will search for an existing Person with the same Identifier of the same Type. The Pipeline will perform a case-sensitive search, and only against active Identifiers.
  • External: Not yet implemented (CFM-375)
  • No Matching: No matching is performed. Every new System of Record entry will generate a new Person.

Warning: For both Email Address and Identifier matching, if more than one matching Person is found, which record is matched is non-deterministic. In general, these strategies should only be used when the matching attributes are unique across People within the CO.
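The difference between the Email Address and Identifier strategies can be summarized in a short sketch. The Person class and the tuple layouts below are assumptions made for illustration. Only the comparison rules (case-insensitive against verified Email Addresses, case-sensitive against active Identifiers) come from the descriptions above.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    person_id: int
    emails: list = field(default_factory=list)       # (address, type, verified)
    identifiers: list = field(default_factory=list)  # (value, type, active)


def match_by_email(people, address, email_type):
    """Case-insensitive match, considering verified addresses only."""
    return [
        p for p in people
        if any(a.lower() == address.lower() and t == email_type and verified
               for a, t, verified in p.emails)
    ]


def match_by_identifier(people, value, id_type):
    """Case-sensitive match, considering active identifiers only."""
    return [
        p for p in people
        if any(v == value and t == id_type and active
               for v, t, active in p.identifiers)
    ]


people = [
    Person(1, emails=[("Pat@Example.edu", "official", True)],
           identifiers=[("abc123", "employee", True)]),
    Person(2, emails=[("lee@example.edu", "official", False)],
           identifiers=[("ABC123", "employee", True)]),
]

email_matches = match_by_email(people, "pat@example.edu", "official")
id_matches = match_by_identifier(people, "ABC123", "employee")
```

Both functions return a list because more than one Person may match. That is the ambiguous case the warning describes, and the sketch makes no attempt to resolve it.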

When the Pipeline searches for a matching Person record, the overall Person status is ignored (AR-Pipeline-2). That is, the new External Identity may link to a Person in a status other than Active, including Suspended or Duplicate. This may result in a recalculation of the Person status.

Sync Configuration

Various Pipeline settings control how data sync is performed.

Role Status On Delete

When an External Identity Role is deleted from the underlying System of Record data, the corresponding Person Role will be set to the specified status.

Warning: The exact behavior of Role deletion depends on how the Plugin in use handles various situations. For an overview, see Handling Inactive Roles.

Sync to COU

If set, any new Person Role created from the External Identity Source connected to the Pipeline will be placed in the configured COU. Currently, if multiple External Identity Roles create multiple Person Roles, all will be assigned to the same COU.

Replace Record in COU

XXX

Person Role Affiliation

If set, any Person Role created from the External Identity Source connected to the Pipeline will be given the configured affiliation, otherwise the affiliation provided by the External Identity Source backend will be used.
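The way Sync to COU and Person Role Affiliation shape a Person Role created by the Pipeline can be sketched as follows. The function name, parameters, and dictionary keys are hypothetical. The sketch only models the "configured value, otherwise the value from the source" rule described above.

```python
def build_person_role(ext_role, sync_cou=None, role_affiliation=None):
    """Model how Pipeline settings shape a newly created Person Role."""
    return {
        "title": ext_role.get("title"),
        # Sync to COU: every Role created by this Pipeline gets the same COU.
        "cou": sync_cou,
        # Person Role Affiliation: the configured value wins, otherwise
        # the affiliation provided by the source is used.
        "affiliation": role_affiliation or ext_role.get("affiliation"),
    }


ext_role = {"title": "Analyst", "affiliation": "staff"}

with_override = build_person_role(ext_role, sync_cou="Research", role_affiliation="member")
without_override = build_person_role(ext_role, sync_cou="Research")
```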

Sync Identifier Type

When External Identity Sources specify Managers and/or Sponsors within an External Identity Role record, these relationships must be specified using an Identifier, since the External Identity Source does not know Registry's internal Person ID. This configuration determines which Identifier Type is used to map these relationships. If this Type is not configured, Manager and Sponsor relationships will not be synced to the Person Role record.

Pipeline Status Management

Pipelines will update Person Status in accordance with Person and External Identity Status.

Of particular note, when an External Identity Source deletes an External Identity Role, the associated Person Role will be set to the status configured by Role Status On Delete (see above).

Syncing Relationships

It is possible to map Manager and Sponsor relationships from External Identity Roles to Person Roles via Pipelines. To enable this capability, set Sync Identifier Type in the Pipeline configuration to the Identifier Type that will be used to identify the Manager and Sponsors in the External Identity Source record.

The External Identity Source plugin must populate manager_identifier and/or sponsor_identifier with a value of the configured Identifier Type. When syncing an External Identity Role, if the Pipeline can find an existing Person record with the specified Identifier of the configured Type, it will insert a link to that Person ID in the new or updated Person Role Record. If no matching Person record is found, a note will be made in the logs but the Pipeline will continue processing the rest of the record and the relevant field in the Person Role will be left blank. If multiple Person records are found matching the specified Identifier, the first Person returned by the database will be used.
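The lookup rules above can be sketched as a small resolver. Everything here other than the field names manager_identifier and sponsor_identifier is an assumption for illustration: the index dictionary stands in for a database query, and the order of its lists stands in for the order in which the database returns rows.

```python
import logging

logger = logging.getLogger("pipeline_sketch")


def resolve_relationship(index, sync_identifier_type, value):
    """Return the Person ID for a manager or sponsor identifier, or None."""
    if not sync_identifier_type or not value:
        return None  # Sync Identifier Type unset, or nothing to resolve
    matches = index.get((sync_identifier_type, value), [])
    if not matches:
        # Note the miss and carry on; the field is left blank.
        logger.info("No Person found for %s identifier %r", sync_identifier_type, value)
        return None
    return matches[0]  # multiple matches: the first one returned is used


# Hypothetical lookup table: (identifier type, value) -> Person IDs.
index = {
    ("employee", "E100"): [17],
    ("employee", "E200"): [42, 43],
}

ext_role = {"title": "Analyst", "manager_identifier": "E100", "sponsor_identifier": "E999"}

person_role = {
    "title": ext_role["title"],
    "manager_person_id": resolve_relationship(index, "employee", ext_role.get("manager_identifier")),
    "sponsor_person_id": resolve_relationship(index, "employee", ext_role.get("sponsor_identifier")),
}
```

In this example the manager resolves to a Person ID, while the sponsor identifier has no match, so the sponsor field stays empty and processing continues.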

There is currently no requirement that the Manager or Sponsor be in active status.

Manually Rerunning a Pipeline

XXX

Manually resyncing the record will force the Pipeline to run, even if there were no changes to the source record.

See Also

Changes From Earlier Versions

As of Registry v5.0.0

  • Pipelines connect External Identity Sources to People, not Organizational Identities to CO People.
  • Sync on Add/Update/Delete are no longer configurable.
  • Pipelines will automatically create Person Roles for any External Identity Role found. Create CO Person Role Record is no longer supported.
  • Sync Identifier Type must be configured to a specific type in order to enable Syncing Relationships. A blank value may no longer be used to match any type of Identifier.
  • If a Match Strategy finds more than one Person, the result is non-deterministic. Previously, an error was thrown.