Logistics

  • 09 February 2023
  • Ashish Pandit, University of California at San Diego (convenor)
  • Rupert Berk, University of Washington
  • Paul Prestin, University of Washington
  • Matthew Hailstone, Brigham Young University

Notes

The University of Washington

  • Several big process-transformation business-change programs:
    • 2014 Start implementing Workday HCM
    • 2017 Workday HCM go-live
    • 2018 Start implementing Workday Finance
    • 2020 Integration Guardrails developed
    • 2023 Workday Finance go-live expected in July
  • UW is big and decentralized, with multiple stakeholders and multiple internal organizational units (e.g., the medical school) that operate somewhat separately and somewhat differently from one another.  This environment requires the introduction of guardrails that outline how-we-do-things-here, and these agreements frame the decision-making about how specific integrations are delivered.  Note that there are three candidate integration platforms at play here:
  • The flow of data between participating systems is illustrated by the high-level diagram below: some systems integrate directly with Workday, whereas others interact with Workday through APIs and events.  Data also flow in and out of Workday through the UW Medicine Interoperability Platform (and also into their separate data warehouse).

  • It made sense during the implementation to have some of the data-consumption services running through common pipelines, both to rightsize the load on Workday and to enable business rules to be defined and applied only once.  This gave rise to the Finance Data Repository approach illustrated below.
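
  • As a rough illustration of that define-once idea, here is a minimal sketch (all names hypothetical, not UW's actual code) of a common pipeline that applies a shared business rule exactly once before fanning the result out to every consumer:

    # Minimal sketch of the define-once idea behind the Finance Data
    # Repository: business rules run a single time in a common pipeline,
    # and every consumer receives the already-transformed records.
    # All names here are hypothetical.

    def apply_business_rules(record: dict) -> dict:
        """Shared transformations, defined once for all consumers."""
        record = dict(record)
        # Example rule: normalize cost-center codes to a canonical format.
        record["cost_center"] = record["cost_center"].strip().upper()
        return record

    def pipeline(records, consumers):
        for record in records:
            transformed = apply_business_rules(record)  # applied exactly once
            for consumer in consumers:                  # fan out to consumers
                consumer(transformed)

    # Usage: both downstream consumers see identical, rule-applied data.
    pipeline(
        records=[{"cost_center": " fin-001 ", "amount": 125.50}],
        consumers=[print, print],
    )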

  • A more-detailed view of the data flowing through the pipelines is shown below.  Note that the journals are big: something like four million rows of data moving through each day.  Other data entities flow at much smaller volumes and much lower velocity.
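
  • For a sense of scale, a quick back-of-envelope calculation (the bytes-per-row figure is an assumption, not from the presentation) connects that daily journal volume to the multi-GB files mentioned further below:

    # Back-of-envelope: ~4 million journal rows per day at an assumed
    # ~500 bytes per exported row lands squarely in multi-GB territory.
    rows_per_day = 4_000_000
    bytes_per_row = 500          # assumption, for illustration only
    daily_gb = rows_per_day * bytes_per_row / 1e9
    print(f"~{daily_gb:.1f} GB of journal data per day")   # ~2.0 GB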

  • Workday HCM was the first to go live at UW, and there were lessons learned from that experience about the approach taken to integration.  There have been quite a lot of changes in the approach to Workday integration over the past five years.  Some of the lessons related to the operational ownership of the platforms, and also to the fact that the longer you run Workday, the more data you have to handle!  Out-of-the-box integration approaches (such as the scheduled file delivery and some of the event messaging that Workday supports) are typically those favored by the Workday implementation partners:

  • Note in particular:
    • a fixed four-hour time limit on integration jobs; as the size of your Worker dataset grows over time (about 60,000 active workers at any one time at UW), file-based integrations start to take a long time to run.
    • early efforts also looked at the event-based messaging in Workday, but that direct-from-Workday feature caused significant performance problems with Workday!
  • Finance is different from HCM!  There are a lot more data in Finance, and native Workday integration features such as the file export or the event messaging do not cope well with what end up being multi-GB files (those four-million-row journals, etc.).
  • UW has settled now on a mixture of MuleSoft and custom toolsets (C# and AWS and off-the-shelf Elasticsearch).
  • Choose your toolsets wisely, and expect to need different lenses to see the data ingress and egress from Workday (like the different windows looking into a house).  Note in particular that Reports-as-a-Service (RaaS) can be complex and require replication across multiple Workday tenants, which can be fraught.

  • Three different toolsets are involved, using WQL for the identification and capture of incremental data changes.  Several of the interfaces here use SOAP (some are RESTful as well).
  • The off-the-shelf standardized integration patterns from Workday are summarized below, along with some of the features and challenges involved in deploying them.  Workday consultants will usually offer these patterns when they design your integrations!
  • UW has taken an eventual-consistency approach based upon regular short-interval polling and nightly reconciliation processes; a sketch of this polling pattern follows below.
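
  • As a rough sketch of that polling pattern (the WQL endpoint shape, source and field names, and response format are assumptions for illustration, not UW's actual code; check your tenant's Workday API reference):

    # Minimal sketch of short-interval polling for incremental changes via
    # WQL.  Endpoint, WQL source/field names, and response shape are all
    # assumptions for illustration.
    import time
    import requests

    WQL_URL = "https://HOST/api/wql/v1/TENANT/data"    # hypothetical tenant URL

    def publish_to_pipeline(row: dict) -> None:
        print("publish:", row)          # stand-in for the real pipeline hand-off

    def poll_changes(session: requests.Session, since: str) -> list[dict]:
        """Fetch rows changed since the given watermark timestamp."""
        wql = (
            "SELECT journalNumber, lastUpdated "       # hypothetical fields
            "FROM journals "
            f"WHERE lastUpdated > '{since}'"
        )
        resp = session.get(WQL_URL, params={"query": wql}, timeout=60)
        resp.raise_for_status()
        return resp.json().get("data", [])

    def run(session: requests.Session, interval_seconds: int = 300) -> None:
        watermark = "2023-02-09T00:00:00Z"             # last successful poll
        while True:
            for row in poll_changes(session, watermark):
                publish_to_pipeline(row)
                watermark = max(watermark, row["lastUpdated"])
            time.sleep(interval_seconds)               # short-interval polling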

Brigham Young University

  • BYU has defined a "Business Data Pipeline", featuring a generous level of abstraction above the underlying systems and processes.  Four "lanes" here: data preparation, data producers, data delivery, and data consumers (a toy sketch follows this list).
  • Data Preparation
  • Data Producers:
  • Data Delivery and Data Consumers:
  • ...and all together in a bigger image:
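
  • Purely to make the four-lane vocabulary concrete, a toy sketch (hypothetical names throughout; BYU's actual pipeline was presented in diagram form only):

    # Toy illustration of the four "lanes" -- preparation, producers,
    # delivery, consumers -- as composable stages.
    def prepare(raw: dict) -> dict:
        """Data Preparation: clean and shape the raw source record."""
        return {key.lower(): value for key, value in raw.items()}

    def produce(record: dict) -> dict:
        """Data Producers: publish the record as a business object."""
        return {"type": "business-object", "payload": record}

    def deliver(obj: dict, consumers) -> None:
        """Data Delivery: route the business object to each consumer."""
        for consume in consumers:
            consume(obj)

    # Data Consumers: anything subscribing at the end of the pipeline.
    deliver(produce(prepare({"NetID": "cosmo", "Role": "student"})), [print])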

Discussion

  • ¿ UW is pulling data from APIs and RaaS to populate the UW enterprise data platform: what is BYU planning here? = BYU is using Informatica to do some of that heavy lifting.  BYU is early in its Workday integration, and is interested in looking at more-incremental and lightweight approaches using other methods such as WQL.
  • ¿ UW does nightly reconciliation, but is that checked/audited, and how often are discrepancies found? = UW takes the approach of doing a full poll out of Workday and comparing hash values against what's in the UW data store; if there are differences then an update is required.  The incremental update is generally expected to always work and be correct.  Where discrepancies are found, investigations are undertaken to identify the root cause and fix it for the future.  (A sketch of the hash-comparison idea appears after this list.)
  • ¿ It's 2023, so what is Workday doing still offering SOAP interfaces?! = SOAP is old(er) and more complex, but it is also really mature and really capable.  It is harder to hire people who know SOAP, given that REST is so prevalent these days, but do note that the ecosystem around REST is still not as mature as the corresponding ecosystem around SOAP.  Workday is moving fairly quickly through the process of moving its interfaces across from SOAP to REST, but some things, like certain affordances, are being lost along the way.
  • ¿ Reconciliation: when discrepancies are found by the UW reconciliation job, does it autonomously retrigger the divergent event? = As much of this as possible is subject to full automation!  At a high level, there are emitters, reconcilers, and updaters, and they all land in the same integration pipeline.  The integration team identifies and fixes (or arranges the correction of) permissions or data-semantics errors.
  • ¿ There is a data store in the UW architecture sitting in front of the data warehouse, and APIs are involved: how does that all fit together? = Workday internally appears to use some kind of MySQL derivative as its master source, and your transactions are ACID at that layer, but you cannot query it directly; it's all replicated into Elasticsearch, and that's where the APIs happen.  UW has somewhat mirrored that architecture by establishing a data cache using Elasticsearch that feeds the data warehouse.  UW has not built (and probably will not build) APIs on the data warehouse, because the use-cases there are different, and it's not intended to play that kind of role.
  • ¿ Is there any formal cache-invalidation strategy applied to UW's Elasticsearch data cache, or is that just left to the incremental data updates from the source system? = It's left to the incremental updates.
  • ¿ BYU has APIs built against a data store; how is the provisioning lag into that data store being managed? = The goal of the data architecture shown above is a business-domain-to-business-domain integration platform that has its business objects populated from the source systems as quickly as possible.  Provisioning is generally sub-second, so APIs can expect to have access to pretty good data.  In some cases (e.g., those relying on Informatica jobs scheduled on an hourly basis) there will be some stale data.
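
  • To make the hash-comparison reconciliation concrete, a minimal sketch under stated assumptions (the record shapes and the sample data are hypothetical; divergent keys would be re-queued into the same update pipeline):

    # Minimal sketch of nightly reconciliation by hash comparison: hash
    # every record from a full Workday poll, compare against the local
    # store, and report any keys that are missing or stale.
    import hashlib
    import json

    def record_hash(record: dict) -> str:
        """Stable hash of a record (sorted keys, so field order is irrelevant)."""
        canonical = json.dumps(record, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def reconcile(workday_records: dict[str, dict],
                  local_records: dict[str, dict]) -> list[str]:
        """Return the keys whose Workday and local copies disagree."""
        divergent = []
        for key, wd_record in workday_records.items():
            local = local_records.get(key)
            if local is None or record_hash(local) != record_hash(wd_record):
                divergent.append(key)    # missing or stale -> needs an update
        return divergent

    # Hypothetical sample data: journal J-2 has drifted in the local store.
    print(reconcile(
        {"J-1": {"amount": 100}, "J-2": {"amount": 250}},
        {"J-1": {"amount": 100}, "J-2": {"amount": 200}},
    ))    # -> ['J-2']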

Further Information

Slide Deck

The UW presentation is available here.
