Duo Security Outage - Responses and Planning for Future

This page offers advice on planning for and handling outages of the Duo Security service.

Monitoring Duo

  1. Duo offers a status page, https://status.duo.com/, with outage information; it is a good place to start.
  2.  Duo API host monitoring is prudent

    We learned today that simple ping monitoring of the Duo API host is
    insufficient. Based on today's incident, it appears that you should at
    least perform an HTTP GET on some API host resource and alert on slow
    responses and/or error status codes.
    1. I didn’t see any data in the thread on how to actually do the monitoring, so I asked Duo in a support ticket and they recommended a particular link on the API host to pull:
      https://$APIHOST/auth/v2/ping

      where APIHOST is your specific hostname which would look something like:
      api-12345678.duosecurity.com

      This pulls a little JSON packet like so:
      {"response": {"time": 1454437771}, "stat": "OK"}

      The specific recommendation was: "The best course of action for monitoring your API hostname to discover when an issue may occur would be to set up a heartbeat ping to your specific API hostname. This can be utilized with a script to automatically email if packets are consistently dropped or if a connection is unable to be made."

      I double-checked with them whether this would be superior to a simple ping of the host, and they said yes, it does exercise the application.
  3. Different Duo services: the Duo iFrame (Web SDK integration) versus direct use of the API. Which endpoint to monitor depends on how a site is using Duo.
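
The HTTP GET check described in item 2 can be sketched as follows. This is a minimal illustration, not a Duo-provided script: the function names, the 2-second slow threshold, and the use of Python's standard library are assumptions, and api-12345678.duosecurity.com is the placeholder hostname used above. It reports error status codes, a body whose "stat" is not "OK", slow responses, and connection failures.

```python
import json
import time
import urllib.error
import urllib.request

SLOW_THRESHOLD_SECONDS = 2.0  # assumption: tune to your own baseline


def evaluate_ping(status_code, body, elapsed_seconds,
                  slow_threshold=SLOW_THRESHOLD_SECONDS):
    """Return a list of problems for one /auth/v2/ping result (empty = healthy)."""
    problems = []
    if status_code != 200:
        problems.append(f"unexpected HTTP status {status_code}")
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        problems.append("response body is not valid JSON")
    else:
        # A healthy reply looks like {"response": {"time": ...}, "stat": "OK"}.
        if not isinstance(payload, dict) or payload.get("stat") != "OK":
            problems.append("stat is not OK")
    if elapsed_seconds > slow_threshold:
        problems.append(f"slow response: {elapsed_seconds:.2f}s")
    return problems


def check_duo_ping(api_host, timeout=10.0):
    """Perform one HTTP GET against the ping URL and evaluate the result."""
    url = f"https://{api_host}/auth/v2/ping"
    started = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status_code, body = response.status, response.read()
    except urllib.error.HTTPError as exc:
        # Error status codes (4xx/5xx) arrive as exceptions in urllib.
        status_code, body = exc.code, exc.read()
    except OSError as exc:
        # Covers URLError, DNS failures, refused connections and timeouts.
        return [f"connection failed: {exc}"]
    return evaluate_ping(status_code, body, time.monotonic() - started)
```

A scheduler could call check_duo_ping("api-12345678.duosecurity.com") with your own hostname and alert when the returned list is non-empty on several consecutive runs, which matches the "consistently dropped" wording in Duo's recommendation.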

Bypassing Duo

Choosing an approach: fail-open vs. fail-closed, and the implications of each

It's a classic risk/cost balance.  The right answer depends on tolerance for risk (in terms of less-secure authentication, as well as loss of the authentication service itself) and what price you're willing to pay.

One approach mixes the two: those applications that must remain protected (HIPAA, FISMA Moderate, in their opinion) remain protected during an incident. However, the much larger (generally speaking) user base of self-service and less secure applications can continue to operate during an event. This provides a middle-of-the-road solution that protects what "must" be protected and allows those with lower risk profiles to continue to operate.

Fail-open becomes more defensible as a steady state if the IdP accurately reports the authentication mechanism used. Applications would then be able to reject specific assertions or access requests on their own, over and above what the IAM system decides, but users who had opted in to two-factor wouldn't be locked out of everything.

A few solutions were offered to support a fail-open integration, to allow AuthN to continue in a weakened state:

  1. (need cookbooks for both CAS and Shib)
  2. Warren Curry, Brett Bieber and Rhian Resnick appear to have a good way of implementing a per-user 'toggle', driven by a live/replicated data source, that preserves authentication but can change the AuthN context when needed during a service outage.
  3. Configure IdP to check group membership before prompting for Duo, and remove users from the group to bypass.
    1. Nebraska uses a CAS Duo Extension configured to check for a specific attribute value memberOf: cn=psp:orgs:idm:DuoEnabled,ou=grouper,ou=group,dc=unl,dc=edu
  4. ...
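
The group-membership check in item 3 can be sketched as below. This is only an illustration of the idea, not the CAS Duo Extension's actual code: the function names are hypothetical, the DN is the Nebraska example quoted above, and the DN comparison is deliberately naive (it lower-cases and trims around commas, and does not handle escaped commas).

```python
# Example group DN taken from the Nebraska configuration quoted above.
DUO_GROUP_DN = "cn=psp:orgs:idm:DuoEnabled,ou=grouper,ou=group,dc=unl,dc=edu"


def normalize_dn(dn):
    """Lower-case a DN and strip spaces around commas (naive comparison form)."""
    return ",".join(part.strip().lower() for part in dn.split(","))


def should_prompt_for_duo(member_of, duo_group_dn=DUO_GROUP_DN):
    """Return True when the user's memberOf values include the Duo-enabled group."""
    wanted = normalize_dn(duo_group_dn)
    return any(normalize_dn(value) == wanted for value in member_of)
```

Under this scheme, bypassing Duo for a user during an outage amounts to removing that user from the group, so that should_prompt_for_duo returns False at the next login.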

Communicating AuthN results to SPs

That is, when the IdP is authenticating in bypass/fail-open mode, what should be sent to SPs to indicate that the authentication process that took place did not include MFA?

There are two basic scenarios in which this question might be asked:

SP requests/validates and IdP communicates MFA success

In this case, the SP explicitly requests MFA (or the equivalent) through the requested authentication context(s). The IdP then communicates the fact of Duo (or any other MFA) authentication success through a separate AuthN context.

The IdP must not (per SAML standards) assert Duo/MFA success if authentication is done via "fail-open", so if MFA fails, the SP authentication process will fail ("cannot authenticate using MFA") or be modified ("authenticated with password only"). Note that using approved alternatives to a primary MFA mode is not necessarily the same as "fail-open". E.g., if Duo push fails, but Duo authentication can be performed with other mechanisms (texting a code, use of a one-time code, etc.), it would still be acceptable for the IdP to indicate MFA success.

SPs explicitly requesting/consuming MFA contexts therefore need to either accept downtime when MFA systems fail or be configured to allow password-only authentication under appropriate circumstances. (E.g., the SP could be easily reconfigurable to allow "Password Protected Transport" logins in the event of an MFA failure.)
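
The SP-side choice described above can be sketched as a small acceptance check. This is a hypothetical illustration, not part of any SP software: the function name and the allow_password_fallback switch are assumptions, and the REFEDS MFA profile URI stands in for whatever MFA context a given federation or IdP actually uses. The PasswordProtectedTransport URN is the standard SAML 2.0 class reference.

```python
# Standard SAML 2.0 authentication context class for password logins.
PASSWORD_CONTEXT = "urn:oasis:names:tc:SAML:2.0:ac:classes:PasswordProtectedTransport"
# Assumption: the IdP signals MFA with the REFEDS MFA profile URI.
MFA_CONTEXT = "https://refeds.org/profile/mfa"


def accept_authn_context(asserted_context, allow_password_fallback=False):
    """Decide whether the SP accepts a login based on the asserted AuthN context.

    allow_password_fallback is a hypothetical operator switch that would be
    turned on only during an MFA outage.
    """
    if asserted_context == MFA_CONTEXT:
        return True
    if asserted_context == PASSWORD_CONTEXT:
        return allow_password_fallback
    # Unknown or missing contexts are rejected.
    return False
```

With the switch off, the SP accepts downtime during an MFA outage; with it on, the SP knowingly accepts password-only logins for the duration.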

IdP locally (and silently) enforces MFA

In this case the IdP uses local criteria (not based on specific requests from the SP) to decide whether to authenticate the user using MFA (e.g., flags on the user object in the IDM system). Typically in this case the IdP does not communicate the fact of MFA to the SP, instead indicating simply success of a "Password Protected Transport" login.

Here the SP is not able to (or presumably interested in trying to) determine whether MFA actually occurred as part of the authentication event, and is relying on the IdP to make the appropriate "authentication strength" decision. In this case the IdP can indicate authentication success if "fail-open" occurs, presuming that this is consistent with the IdP's normal operating practices. Pragmatically, if SPs are relying on IdP-managed and -enforced MFA support for increased security it is advisable to document MFA failure behavior (of defaulting to "fail-open" or "fail-closed") to ensure that the SP operators are aware of the impacts. That is, if the SP operator knows the IdP operator is enforcing MFA, but the fact of MFA is not communicated explicitly in the SAML assertion to the SP, then whether or not "fail-open" success is "acceptable" for user authentication is a business/SLA form of decision that cannot be inspected as part of the SAML/authentication conversation.

If the IdP supports "fail-open" operation, then SPs that do not wish to have authentication success in a "fail-open" mode would need to be addressed on a case-by-case basis by the IdP operator or, preferably, be reconfigured to explicitly request MFA support (as described above) so that the SP can make the determination of whether MFA was successful and take the appropriate action in response.
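
The two scenarios can be summarized as one IdP-side decision, sketched below under simplifying assumptions. All names are hypothetical, and scenario 1 is reduced to "fail when MFA cannot be performed", leaving out the "authenticated with password only" variant mentioned earlier.

```python
from enum import Enum


class Outcome(Enum):
    MFA = "assert an MFA context"
    PASSWORD = "assert Password Protected Transport"
    FAIL = "fail the authentication request"


def decide_outcome(sp_requested_mfa, user_flagged_for_mfa, duo_available, fail_open):
    """Return what the IdP would do for one login under the stated assumptions."""
    if sp_requested_mfa:
        # Scenario 1: MFA success must never be asserted if MFA did not happen.
        return Outcome.MFA if duo_available else Outcome.FAIL
    if not user_flagged_for_mfa:
        # No local MFA requirement: plain password login.
        return Outcome.PASSWORD
    if duo_available:
        # Scenario 2: MFA is performed silently but not reported to the SP.
        return Outcome.PASSWORD
    # Scenario 2 during an outage: the local fail-open policy decides.
    return Outcome.PASSWORD if fail_open else Outcome.FAIL
```

Note that in scenario 2 the SP sees the same result whether MFA ran or the IdP failed open, which is why the failure behavior needs to be documented out of band.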

 

 


3 Comments

  1. I'm doing some testing to exercise the ping API endpoint, and have found some surprising results. I had a placeholder value, api-XXXXXX.duosecurity.com, in my script and discovered that pings sent to that hostname return an OK result. I'm quite sure that API host doesn't exist, so the check is absolutely not exercising the thing I care about, my Duo API host. I am very suspicious of what the ping endpoint is doing; consequently we are still evaluating monitoring approaches. It should also be noted that any probes that exercised the Duo APIs would not have detected the outage that precipitated this discussion.

    1. Nick Lewis (internet2.edu) - it might be worth checking with the Duo contacts to see if they can tell you what exactly the ping endpoint is doing.  If you can share that here, that would be good.

  2. Nick Roy (internet2.edu) Duo responded to the duo-users list:

    https://lists.incommon.org/sympa/arc/duo-users/2016-02/msg00041.html


    Hi Marvin,


    You've discovered a neat side-effect of some of the routing technology that exists inside our cloud. The /ping endpoint you referenced doesn’t require authentication to send a response. Without the /ping check authenticating to your specific API hostname (api-XXXXXXXX.duosecurity.com), any name checked against can respond to the request.


    As you have pointed out here, Duo can respond to /ping checks for API hostnames, whether they explicitly exist in our cloud infrastructure or not. There are billions of possible combinations for API hostnames based on the 8 unique characters which constitute each DNS entry in our service. That said, not all queries will respond properly if you were to check against them with random DNS values.


    Making a /ping health check request with your specific API hostname (as displayed in your Duo Admin Panel) will always check against the same servers processing your users’ authentications.


    Because we know that /ping is used for automated monitoring in many of our customers’ environments, it is not subject to the same stringent policies such as ‘anomaly detection’ that might be triggered when using other endpoints, such as /auth or /preauth (which are called for user authentications).


    If you have other concerns about monitoring or availability, please reach out to our support team at support@duosecurity.com. We’d love to hear about your specific monitoring needs as we continue to build out product.


    Lastly, I encourage all of you to subscribe to our Status Page. We update it whenever we are experiencing service-level issues and it shows the service status for all of our system deployments. You can find your Deployment ID in your Duo Admin Panel on the left-side pane below the Settings tab.


    Thank You! Trevor Mays
    Duo Support