Consensus - Identifiers

Deprecate SAML PersistentID and eduPersonTargetedID

There are a number of disadvantages to continuing to promote the use of SAML's built-in pairwise identifier. Many of them also apply to the concept of pairwise identifiers in general, leading the working group to conclude that a different approach is warranted. Some of the problems are inherent to any use of the SAML NameID construct to carry personally identifiable information:

Combining the NameID with the use of SAML Attributes is inherently confusing because of the separation of the constructs and the need to understand and configure both mechanisms.
There are few accepted standards for describing NameID formats, and many implementations handle them improperly, failing to recognize the Format or to process the NameQualifier attributes correctly, which leads to interoperability problems.
The NameID is used in LogoutRequest messages, which are frequently passed via redirect. This results in the NameID value ending up inside HTTP logs in a decodable form unless XML Encryption or the POST binding is used. XML Encryption would have to be supported in the direction opposite the usual one, from SP to IdP. This is poorly supported by SP implementations and will typically require that new keys be deployed at every IdP and published in metadata. It simply adds complexity.
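To make the logging concern above concrete, here is a minimal Python sketch of how a NameID can be recovered from a logged redirect URL. The LogoutRequest is trimmed and hypothetical, as are the endpoint URL and the NameID value; the only thing taken from the redirect binding is the encoding itself (raw DEFLATE, then base64, then URL-encoding).

```python
import base64
import zlib
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical, heavily trimmed LogoutRequest; a real one carries more
# attributes and elements (Issuer, IssueInstant, Destination, ...).
LOGOUT_REQUEST = (
    '<samlp:LogoutRequest xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol" '
    'xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion" ID="_example" Version="2.0">'
    "<saml:NameID>user-identifier-123</saml:NameID>"
    "</samlp:LogoutRequest>"
)


def encode_redirect(xml: str) -> str:
    """Raw DEFLATE (no zlib header), then base64."""
    compressor = zlib.compressobj(wbits=-15)
    deflated = compressor.compress(xml.encode("utf-8")) + compressor.flush()
    return base64.b64encode(deflated).decode("ascii")


def decode_redirect(saml_request: str) -> str:
    """Reverse the encoding: base64-decode, then raw INFLATE."""
    return zlib.decompress(base64.b64decode(saml_request), -15).decode("utf-8")


# The URL as it might appear in an access log (hypothetical endpoint).
logged_url = "https://idp.example.org/slo?" + urlencode(
    {"SAMLRequest": encode_redirect(LOGOUT_REQUEST)}
)

# Anyone with the log line can reverse the encoding without any key.
query = parse_qs(urlparse(logged_url).query)
recovered = decode_redirect(query["SAMLRequest"][0])
```

Nothing here requires a secret: the encoding is compression, not encryption, which is why the identifier is readable by anyone with access to the logs.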

The specific formulation of the SAML pairwise identifier has a number of problems, one of which is simply fatal: it was defined to be case-sensitive, which allows issuers to supply identifiers for different users that differ only by case. A variety of common web applications, if not the majority, cannot honor this requirement. Even though e-mail addresses are not themselves defined to be case-insensitive, in practice applications treat them that way, and many assume all identifiers should be handled the same way. While there are techniques to fix SAML implementations so that even generated identifiers are not case-sensitive (e.g., using Base32), existing deployments would have to rekey users.
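The case-sensitivity problem and the Base32 remedy mentioned above can be sketched in a few lines of Python. The two identifier values and the key material are hypothetical, and the generation step is only one possible way to produce a case-safe value, not a method the working group specified.

```python
import base64
import hashlib

# Two hypothetical opaque identifiers that a case-sensitive issuer could
# legitimately assign to two different users.
user_a_id = "Qm9iU21pdGg"
user_b_id = "qM9Iu21PDgG"

# Distinct as far as SAML is concerned ...
distinct_for_saml = user_a_id != user_b_id

# ... but identical once an application normalizes case.
collide_after_normalizing = user_a_id.lower() == user_b_id.lower()

# Base32 uses a single-case alphabet (A-Z, 2-7), so folding case loses
# no information. 20 bytes encode to exactly 32 characters, no padding.
raw = hashlib.sha256(b"hypothetical-user-key").digest()[:20]
case_safe_id = base64.b32encode(raw).decode("ascii").lower()

# The lowercased value still decodes to the original bytes.
round_trip = base64.b32decode(case_safe_id, casefold=True)
```

With a Base64-style alphabet the same folding would be lossy, which is the underlying reason existing deployments would need to rekey users to move to a case-safe encoding.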

If that weren't enough, the SAML formulation has other challenges, such as its comparatively large size and the use of a "triple" containing SAML entityIDs to namespace-qualify the value, which is at odds with the much more common use of a simple domain suffix found in every other identifier. And while it was defined by eduPerson to be usable as a SAML Attribute, its complex XML syntax is unsupported by all but a few open source implementations.

Finally, there are problems with pairwise identifiers themselves. For many deployments in research and academia, they create a barrier to cross-application correlation of identity, and they even get in the way of users who want to be non-anonymous. Often these deployments have no choice but to resort to proxy systems that consume a single pairwise identifier on behalf of many applications. The mechanism in SAML to directly support this use case without proxies has simply not been adopted by federations. It's also unclear whether such identifiers provide the legal protections under EU privacy law that many have used to argue for their use.

Taken as a whole, the working group believes these are compelling arguments for a change in strategy as we pursue this profiling work. We may find that the concept of pairwise identifiers is too important to lose, but if so, we need a different way to communicate it that breaks with current practice.

for saml2int

-No more use of subject identifiers at the assertion level other than transients (for logout)

-No use of encrypted identifiers; if you want to support logout, you would use a transient ID

-Only use attributes to carry identifiers in the assertion

-We don't expect support for non-string-based attributes any longer (precluding eduPersonTargetedID)

-SAML persistent ID cannot be used safely because it's case-sensitive, which is unworkable in many COTS applications

for R&E federations

-Use ePUID

-We NEED to codify/ratify the caseIgnoreMatch status; we probably need to further profile this to exclude Unicode scopes

-Use non-reassigned ePPN

THAT'S IT

Pairwise IDs offer ZERO legal protection under EU privacy law (this is not known to be true, but neither is it known to be false), and come at a HUGE cost to deployers of IdPs and collaborative / research SPs.

SAML made an assumption that you'd just compare identifier strings as-is. It turns out that apps / implementations do not do this; they end up converting to all upper or all lower case ('normalizing' case). SAML's existing solution for pairwise identifiers is therefore dangerous and should be replaced if we still want pairwise IDs.
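As a rough illustration of the comparison behavior the R&E recommendations point toward, the Python sketch below compares scoped identifiers case-insensitively and rejects non-ASCII scopes. The function names are hypothetical. Plain lowercasing is only a stand-in for caseIgnoreMatch, which is why the sketch also restricts the local part to ASCII; that extra restriction is an assumption of this example, not something stated in the notes.

```python
def normalize_scoped(value: str) -> str:
    """Return a comparison key for a local@scope identifier (hypothetical helper)."""
    local, sep, scope = value.rpartition("@")
    if not (sep and local and scope):
        raise ValueError(f"expected local@scope, got {value!r}")
    if not scope.isascii():
        # Mirrors the idea of profiling out Unicode scopes.
        raise ValueError("non-ASCII scope rejected")
    if not local.isascii():
        # Assumption of this sketch: lowercasing is only a safe
        # approximation of caseIgnoreMatch for ASCII input.
        raise ValueError("non-ASCII local part rejected in this sketch")
    return f"{local.lower()}@{scope.lower()}"


def same_identifier(a: str, b: str) -> bool:
    """Compare two scoped identifiers without regard to case."""
    return normalize_scoped(a) == normalize_scoped(b)
```

Because every accepted value maps to one lowercase key, an application that normalizes case on its own reaches the same answer as one that calls `same_identifier`, which is the property the case-sensitive SAML identifier lacks.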

Identifier Properties Consensus

Non-reassignable: YES
Persistent (for some reasonable timeframe): YES
Human friendly: NOT important
Name based: NOT important
Correlatable (not-targeted): IMPORTANT
Targeted/pair-wise (non-correlatable): IMPORTANT
Domain-scoped: TBD

NOTE: A single identifier cannot be both correlatable (#5) and targeted/pairwise (#6) at the same time.
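The tension in the note above can be seen in a small Python sketch. The keyed-hash derivation, the secret, the user key, and the entityIDs are all assumptions made for illustration; the working group did not specify how either kind of identifier should be generated.

```python
import hashlib
import hmac

# Hypothetical IdP-side secret and internal user key.
SECRET = b"hypothetical-idp-secret"
USER_KEY = "internal-user-0042"


def pairwise_id(user_key: str, sp_entity_id: str) -> str:
    """Targeted value: the SP's entityID is part of the input."""
    message = f"{user_key}!{sp_entity_id}".encode("utf-8")
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def correlatable_id(user_key: str) -> str:
    """Shared value: every SP receives the same string."""
    return hmac.new(SECRET, user_key.encode("utf-8"), hashlib.sha256).hexdigest()


sp_one = "https://sp1.example.org/shibboleth"
sp_two = "https://sp2.example.org/shibboleth"

# Two SPs cannot match their records using pairwise values ...
pairwise_values = {pairwise_id(USER_KEY, sp_one), pairwise_id(USER_KEY, sp_two)}

# ... while a correlatable value is the same wherever it is released.
shared_value = correlatable_id(USER_KEY)
```

Any value that lets two SPs match records is by definition not pairwise, so meeting both needs means releasing two different identifiers rather than one.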
