
Application Scenarios / Types

Migrating Email-Based Registration into Federation

Many applications that start life with self-registration by e-mail allow only a single user identifier per record. That identifier is overloaded for the following purposes:

  • treated as an email address, with reduced functionality (of varying impact) if e-mail to it bounces
  • used as a displayable value to the owning user
  • used as a value to find other users and to assign/track permissions

The identifier is assumed to change occasionally, but not constantly. Reassignment is treated as an unfortunate but occasional issue that has to be worked around manually, usually by treating the resource(s) as of sufficiently low value to make the risk acceptable.

Confluence and Jira

The Confluence and Jira products are popular "targets" for federation, and are good representations of a class of services we might call "user-driven collaboration and resource sharing".

These specific products support three discrete fields for user identification: a username, an email address, and a "full name". This is a common pattern, although many such applications combine the username and email address concepts.

The username is the key to the user's profile and records in the applications (including access control lists) and is not easily changeable to something else. Links to the user's "personal" space include the username.

If a "full name" is not provided, the username doubles as a displayed label for the user when the user edits content and in the directory of users visible to other users. There are no options to limit display of other users (to support FERPA, for example).

The separate email address field prevents misuse of the primary identifier field as an email address. It is also shown when browsing or selecting users (e.g., for rights assignment).

Internally- or Dual-Keyed Applications

A "best case" scenario for federation is one in which the application either specifically treats user identifiers as internal, non-displayed information, or relies on a pair of identifiers, one of which must have the property of non-reassignment.

An application that relies solely on internal use of identifiers may have problems when provisioning is taken into account (see below), but can otherwise function well with a directed or at least opaque identifier that a user would not see or know. This makes it easier for an IdP to satisfy requirements for stability and non-reassignment. Applications that care about preventing the risk of reassignment may have to limit themselves to accepting attributes that guarantee non-reassignment, or assess the use of other attributes on a per-IdP basis.

When privacy is not a requirement or goal, additional data such as email address and/or legal name attributes may be required along with the identifier.

Applications that can handle at least two user keys may be able to offer better provisioning and resource sharing behavior by leveraging an opaque identifier as the underlying/permanent key, and a displayable/friendlier identifier as a user-interface supplement. The two kinds of change are then distinguishable: if the opaque identifier stays the same while the non-opaque identifier differs, the user's friendly identifier has changed; if a known non-opaque identifier arrives with a different opaque identifier, it has been reassigned. Either way, records can be updated immediately without user intervention.
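The dual-key detection described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the `UserRecord` type, the `reconcile` function, and the outcome labels are hypothetical names, and the records are assumed to live in a plain in-memory dictionary keyed by the opaque identifier.

```python
from dataclasses import dataclass


@dataclass
class UserRecord:
    opaque_id: str    # permanent key, never shown to users (hypothetical field)
    display_id: str   # friendly identifier shown in the UI (hypothetical field)


def reconcile(records: dict, opaque_id: str, display_id: str) -> str:
    """Classify a login against stored records and update them in place.

    `records` maps opaque_id -> UserRecord. Returns one of
    "new", "known", "renamed", or "reassigned".
    """
    existing = records.get(opaque_id)
    # Find any *other* record that currently holds this friendly identifier.
    holder = next(
        (r for r in records.values()
         if r.display_id == display_id and r.opaque_id != opaque_id),
        None,
    )
    if holder is not None:
        # Same friendly identifier, different opaque key: the friendly
        # identifier now belongs to a different person, so detach it
        # from the previous owner's record.
        holder.display_id = ""
    if existing is None:
        records[opaque_id] = UserRecord(opaque_id, display_id)
        return "reassigned" if holder is not None else "new"
    if existing.display_id != display_id:
        # Same opaque key, different friendly identifier: a rename.
        existing.display_id = display_id
        return "reassigned" if holder is not None else "renamed"
    return "reassigned" if holder is not None else "known"
```

Because access control entries would hang off the opaque key in this sketch, a rename needs no permission changes, and a reassignment never grants the new holder the previous owner's rights.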

Applications with "Lifetime" Identification

An application wants the ability to identify and authenticate users regardless of their "current" affiliation with an educational institution. Students graduate and may lose their accounts, and employees and faculty move on from their original employers. Identifiers that are issued and managed by a particular institution may be "orphaned" at that point, or even worse, eventually reassigned to somebody else.

This problem is most acute when an "enterprise IdP" is involved, but is still a problem with other identifier sources. The problem of relying on ISP-issued e-mail addresses is quite well known, but people do move on from Google and Facebook too, even if perhaps less often. Less popular sites might even disappear entirely.

The application wants a "portable" identifier (if there truly is such a thing), or needs to have a mechanism to securely (enough) migrate a user's identifier from one value to another as their options for authentication and IdP-dictated identifiers change. E-mail workflows may be a common way of changing identifiers for a user, but the security of that mechanism is obviously somewhat limited.

Applications from some problem domains may have natural affinities for identifiers that come from natural authorities in those problem domains. For example, the Social Security Administration and the IRS obviously view SSN as the identifier of choice. It will often be true that use of such identifiers outside of their natural/intended scope will be frowned upon, but it will often be tempting.

Common Identifiers Supported by R&E IdPs

eduPersonPrincipalName

  • Very broad support.
  • Typically human readable/friendly, but not required to be.
  • "Scoped", meaning the right-hand side can be assessed for "appropriateness" based on the IdP asserting it.
  • Values tend to change occasionally when name-based.
  • About 50% of sites reassign them to other persons after some period of time, the period varying from months to years.
  • Usually known to a user to some degree, but often not known by other users, particularly those at other organizations.
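Assessing the right-hand side of a scoped value for "appropriateness" could look roughly like the sketch below. The `ALLOWED_SCOPES` table, the entity ID, and the function name are hypothetical; in practice the permitted scopes for each IdP would come from some trusted source such as federation metadata, and the exact comparison rules are an assumption here.

```python
# Hypothetical table: which scopes each IdP is trusted to assert.
ALLOWED_SCOPES = {
    "https://idp.example.edu/idp": {"example.edu"},
}


def scope_is_acceptable(eppn: str, idp_entity_id: str, allowed_scopes: dict) -> bool:
    """Return True if the scope of `eppn` is one the asserting IdP may use."""
    local, sep, scope = eppn.rpartition("@")
    if not sep or not local or not scope:
        return False  # not of the form user@scope
    # Exact string match against the scopes registered for this IdP;
    # an unknown IdP has no acceptable scopes.
    return scope in allowed_scopes.get(idp_entity_id, set())
```

A value such as `jsmith@example.edu` would be accepted from the example IdP above, while the same IdP asserting `jsmith@other.example.org` would be rejected.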

mail

  • Not universally supported for federation use, but broadly supported.
  • Typically human readable/friendly, but not required to be.
  • Inconsistently populated in terms of the source of authority. Some IdPs provide a value controlled by the organization, and some provide a value entirely controlled by the user.
  • Not formally "scoped", meaning that IdPs are meant to be free to assert email addresses from any domain, not just their own, and they often do (see previous point about authority).
  • Values tend to change occasionally when name-based.
  • Some sites reassign them to other persons after some period of time, the period varying from months to years.
  • Many sites support multiple addresses per user and the attribute is multi-valued.
  • Usually known to a user and to other users.

eduPersonTargetedID / SAML 2.0 "persistent" NameID

  • Not (yet?) commonly supported in the US, widely supported in the EU.
  • Not human readable/friendly and tend to be very long (strictly speaking, they can be huge, though generally not that big).
  • Usually different per-service, but this is ultimately up to the IdP.
  • Not "scoped" in the DNS manner, but are explicitly namespace-qualified by the IdP by definition.
  • Are meant to change rarely, if ever, but this can vary by IdP.
  • May not be reassigned to different users, ever, without violating the semantics of the attribute (this is the one absolute).
  • Completely unknown to a user and to other users.
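Because these values are namespace-qualified, an application should treat the opaque string as meaningful only together with its qualifiers. A SAML 2.0 persistent NameID carries `NameQualifier` and `SPNameQualifier` attributes for this purpose. The sketch below assumes the application stores all three parts as a composite key; the `PersistentId` type, the `rights_for` helper, and the sample values are hypothetical.

```python
from typing import NamedTuple


class PersistentId(NamedTuple):
    idp: str     # NameQualifier: the IdP that issued the value
    sp: str      # SPNameQualifier: the service the value was issued for
    value: str   # opaque string, meaningless without the two qualifiers


def rights_for(acl: dict, pid: PersistentId) -> set:
    """Look up rights by the full qualified identifier, never by value alone."""
    return acl.get(pid, set())


# The same opaque value asserted by two different IdPs is two different users.
alice = PersistentId("https://idp.example.edu/idp", "https://sp.example.org/sp", "x7Yq")
other = PersistentId("https://idp.other.example/idp", "https://sp.example.org/sp", "x7Yq")
acl = {alice: {"read", "write"}}
```

Keying on the bare value alone would let one IdP's user collide with, or impersonate, another IdP's user who happens to have the same string.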

Local Campus IDs (Employee Numbers, Student Numbers)

  • Not commonly encouraged for federated use, but there is some usage as "employeeNumber" and other locally-defined attributes.
  • Usually not human readable/friendly.
  • Generally not unique unless combined with some qualifier, such as the IdP.
  • Are meant to change rarely, but this can vary by IdP.
  • Rarely reassigned to other users, but this is entirely up to the IdP and hasn't been studied.
  • May raise privacy concerns (unwisely, since keeping SSNs "secret" is what made them ripe for misuse).
  • Usually known to a user to some degree, but often not known by other users, particularly those at other organizations.

Provisioning Notes

Across most scenarios, there is a set of underlying provisioning models that are impacted by the kinds of identifiers used. In analyzing a problem statement's "fit", understanding the provisioning model required will be critical.

Four common models for provisioning:

  • batch exchanges
  • just-in-time
  • creation by other application users
  • invitation/introduction

With batch models, opaque identifiers are very workable since the knowledge of identifiers is left to the source of the batch feed. Very strong identity linkage is possible with this model.

Just-in-time provisioning assumes that an application need not know about a user until the user logs into the application, which is often a limiting assumption. But the identifier itself can be anything, since it comes directly from the source, and need not be known to users. Very strong identity linkage is possible with this model.
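A minimal sketch of the just-in-time model, assuming an in-memory account store and a hypothetical `jit_login` function that is called after the IdP's assertion has already been validated:

```python
def jit_login(accounts: dict, idp: str, identifier: str, attributes: dict):
    """Return (account, created), creating the account on first login.

    `accounts` maps (idp, identifier) -> account dict. The identifier may be
    fully opaque, since it always arrives from the IdP and is never typed
    by a user.
    """
    key = (idp, identifier)
    created = key not in accounts
    if created:
        accounts[key] = {"key": key, "attributes": {}}
    # Refresh descriptive attributes (name, mail, ...) on every login.
    accounts[key]["attributes"].update(attributes)
    return accounts[key], created
```

The limitation noted above is visible here: until `jit_login` has run for a user, there is no record for anyone else to grant rights to.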

Many application UIs rely on the ability of one user to assign rights, group memberships, and similar properties to a user who may or may not already have logged into the application. Assuming the answer is not "sorry, you have to wait for them to log in", the choice of identifier is usually constrained to something friendly, and often will need to be an email address, or mismatches will occur. Alternatively, some form of integrated identity mapping/searching functionality would need to be available (this is sometimes called a "people picker"). Of course, in isolated cases, some other kind of domain-specific identifier could be known to the user community of the application. Identity linkage in this model can be a concern because users may mistakenly enter the wrong identifier, and expose resources to a user other than the one intended before the mistake is found.

Finally, the invitation model is one in which users do nothing but identify other users by email address, causing a message to be sent to that address with a code that binds the identity intended to the reader of the email. This is a kind of asynchronous SSO transaction that combines JIT provisioning with an email/code as a placeholder. This model allows for use of opaque identifiers, but identity linkage is based on email and the security of that system.
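The invitation flow can be sketched as below. This is an illustration under stated assumptions: the `InvitationStore` class, its method names, and the string form of a "grant" are hypothetical, mail delivery is left out, and a real deployment would also want expiry and rate limiting.

```python
import secrets


class InvitationStore:
    """Binds an emailed one-time code to whoever redeems it after login."""

    def __init__(self):
        self._pending = {}   # code -> (invited email, grant)
        self.bindings = {}   # opaque identifier -> list of grants

    def invite(self, email: str, grant: str) -> str:
        """Record a pending grant and return the code to be emailed."""
        code = secrets.token_urlsafe(32)   # unguessable placeholder
        self._pending[code] = (email, grant)
        return code

    def redeem(self, code: str, opaque_id: str) -> bool:
        """Attach the pending grant to the identifier from the user's login."""
        entry = self._pending.pop(code, None)   # codes are single use
        if entry is None:
            return False
        _email, grant = entry
        self.bindings.setdefault(opaque_id, []).append(grant)
        return True
```

Note that `redeem` binds the grant to whoever presents the code, not to the invited address, which is exactly why the strength of the linkage depends on the security of the email system.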
