In August 2010, I wrote about Workday’s interesting technical architecture, highlights of which included:
- Lots of small Java objects in memory.
- A very simple MySQL backing store (append-only, <10 tables).
- Some modernistic approaches to application navigation.
- A faceted approach to BI.
I caught up with Workday recently, and things have naturally evolved. Most of what we talked about (by my choice) dealt with data management, business intelligence, and the overlap between the two.
It is now reasonable to say that Workday’s servers fall into at least seven tiers, although we talked mainly about five that work together as a kind of giant app/database server amalgamation. The three that do noteworthy data management can be described as:
- In-memory objects and transactions. This is similar to what Workday had before.
- Persistent MySQL. Part of this is similar to what Workday had before. In addition, Workday is now storing certain data in tables in the ordinary relational way.
- In-memory caching and indexing. This has three aspects:
  - Indexes for the ordinary relational tables, organized in interesting ways.
  - Indexes for Workday’s search-box navigation (as per my original Workday technical post, you can search across objects, task-names, etc.).
  - Compressed copies of the Java objects, used to instantiate other servers as needed. The most obvious uses of this are:
    - Recovery for the object/transaction tier.
    - Launch for the elastic compute tier. (Described below.)
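To make the "compressed copies of the Java objects" idea concrete, here is a minimal sketch of how a caching tier might hold a compressed snapshot and hand it to a freshly launched server. Everything in it is my own assumption for illustration: the `SnapshotCache` name, the map-of-maps stand-in for an object graph, and the use of standard Java serialization with gzip. Workday did not say how its copies are actually encoded.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper: keeps a compressed copy of an object graph for later re-instantiation. */
final class SnapshotCache {

    /** Serializes the objects and gzips the result (assumed encoding, not Workday's). */
    static byte[] compress(HashMap<String, HashMap<String, String>> objects) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(new GZIPOutputStream(bytes))) {
            out.writeObject(objects);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Closing the stream above also finishes the gzip trailer.
        return bytes.toByteArray();
    }

    /** Rebuilds an independent copy of the objects, e.g. on a recovering or newly launched server. */
    @SuppressWarnings("unchecked")
    static HashMap<String, HashMap<String, String>> restore(byte[] snapshot) {
        try (ObjectInputStream in = new ObjectInputStream(
                new GZIPInputStream(new ByteArrayInputStream(snapshot)))) {
            return (HashMap<String, HashMap<String, String>>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("snapshot refers to an unknown class", e);
        }
    }
}
```

The same snapshot bytes could serve both uses mentioned above: rebuilding a failed object/transaction server, or seeding an elastic compute node before a batch job.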
Two other Workday server tiers may be described as:
- Elastic compute. This is used for a few kinds of tasks, such as payroll processing, batch reporting or, I presume, batch ETL.
- Assorted management services. The list CTO Stan Swete sent over included (and I quote verbatim):
  - Environment management. What servers are deployed. What is available. Etc…
  - Verification services. To verify that related services are in the various tiers are in sync with the state of the persistence layer.
  - Credentials services for authentications.
  - Management utilities. Scripts to manipulate (create/copy/move/delete) tenants.
  - Services to manage unstructured data.
  - PCI services for credit card processing.
  - Print services.
  - Messaging services.
Finally, Workday has a couple of server types or tiers for talking with other systems, namely for user interface and integrations.
Besides data management, the other cool thing we discussed was a type of live report called worklets. The idea is:
- At their heart, worklets are 2-dimensional reports, with other attributes being drill-down dimensions.
- Worklets take up little screen real estate.
- Thus, worklets are suitable for mobile (tablet) devices, or for embedding in various parts of the application (including otherwise transactional parts).
- A worklet conveys a little bit of information; if you want more, you can pull it from the server.
- Worklets are cached on the user interface server. The total result set in a worklet should be relatively small.
- The two main examples Stan gave me of what a worklet starts with were:
  - A few rows from a result set.
  - A graphical result (but not the detailed data you might want to drill down to behind it).
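The worklet idea, a small cached preview with detail pulled from the server only on request, can be sketched roughly as follows. The `Worklet` class, its constructor arguments, and the `detailFetcher` callback are all hypothetical names of mine; nothing here reflects Workday's actual UI-server code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

/** Hypothetical worklet: a small cached preview plus a hook for fetching detail on demand. */
final class Worklet {
    // The few rows that are cheap enough to keep cached on the user interface server.
    private final List<String> previewRows;
    // Stand-in for a round trip to the server when the user drills down.
    private final Function<String, List<String>> detailFetcher;

    Worklet(List<String> resultRows, int maxRows, Function<String, List<String>> detailFetcher) {
        int kept = Math.min(maxRows, resultRows.size());
        this.previewRows = Collections.unmodifiableList(new ArrayList<>(resultRows.subList(0, kept)));
        this.detailFetcher = detailFetcher;
    }

    /** Returns the cached preview without contacting the server. */
    List<String> preview() {
        return previewRows;
    }

    /** Fetches the detail behind one preview row; this is the only path that calls the server. */
    List<String> drillDown(String row) {
        return detailFetcher.apply(row);
    }
}
```

Capping the preview at construction time is what keeps the cached result set small, which in turn is what makes the worklet cheap to embed on a tablet screen or inside a transactional page.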
Circling back to the five app-server-like tiers, further notes include:
- The elastic compute tier is currently in the same data centers as the rest of Workday’s system. However, in the future it could be on Amazon as an alternative.
- Behind the scenes, reports can run either against tabular data or by traversing the in-memory objects. While some reports are 100-1000x faster on tables, traversing the object graph is in other cases actually more performant. Anyhow, this choice is transparent to users.
- Some data is duplicated between objects and ordinary tables; some is tabular-only.
- The indexes on tabular data are custom to Workday, not native to MySQL. They’re organized in line with the object structure, in a way that sounds somewhat reminiscent of the Akiban hKey.
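For readers who haven't seen the hKey idea, the general notion of an index "organized in line with the object structure" can be illustrated with a sorted map keyed by the path from a parent object down to its children, so that an object and everything beneath it sit next to each other. This is only my sketch of the general technique. The `HierarchicalIndex` class and the slash-separated key format are invented for illustration, and neither Workday's index nor Akiban's implementation is claimed to look like this.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical index whose sort order follows the parent/child structure of the objects. */
final class HierarchicalIndex {
    // Keys are paths such as "worker/1" or "worker/1/job/a".
    // Assumption: path segments are alphanumeric, so "/" sorts below every segment character.
    private final TreeMap<String, String> rows = new TreeMap<>();

    void put(String keyPath, String row) {
        rows.put(keyPath, row);
    }

    /** Returns the row at rootPath followed by all of its descendants, via one contiguous range scan. */
    List<String> subtree(String rootPath) {
        // Upper bound: rootPath + "/" followed by the largest possible char covers every descendant path.
        String upperBound = rootPath + "/" + Character.MAX_VALUE;
        return new ArrayList<>(rows.subMap(rootPath, true, upperBound, false).values());
    }
}
```

The point of the layout is that fetching a worker together with its jobs is a single range scan rather than a join, which is the kind of access pattern an object-centric application tends to need.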
Besides (or in some cases including) the above, the development team is very concerned with controlling the memory footprint of the in-memory Workday system, and it sounds like improvements over time have literally been at the order(s) of magnitude level. Stan seems to attribute this largely to:
- Choosing the right kind of Java collection for various groups of Java objects.
- Compression, although he didn’t give particulars.
Going forward, Workday hopes to get a further 2X+ reduction via lazy loading driven by object usage stats.
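Stan didn't describe the mechanism, so the following is purely my guess at what "lazy loading driven by object usage stats" could look like in miniature: count how often each object is touched, drop the rarely used ones from memory, and reload them only when somebody asks again. The `LazyObjectStore` class, its `loader` callback, and the `minUses` threshold are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical store that keeps only frequently used objects resident in memory. */
final class LazyObjectStore {
    private final Function<String, String> loader;            // stand-in for reading from persistence
    private final Map<String, String> resident = new HashMap<>();
    private final Map<String, Integer> useCounts = new HashMap<>();

    LazyObjectStore(Function<String, String> loader) {
        this.loader = loader;
    }

    /** Records the access, then returns the object, loading it only if it is not already resident. */
    String get(String id) {
        useCounts.merge(id, 1, Integer::sum);
        return resident.computeIfAbsent(id, loader);
    }

    /** Evicts objects used fewer than minUses times since the last call; returns how many were dropped. */
    int evictColdObjects(int minUses) {
        int before = resident.size();
        resident.keySet().removeIf(id -> useCounts.getOrDefault(id, 0) < minUses);
        useCounts.clear(); // start a fresh measurement window
        return before - resident.size();
    }

    int residentCount() {
        return resident.size();
    }
}
```

A real system would presumably weigh object size and reload cost as well as raw access counts, but the trade is the same: less memory held in exchange for an occasional load on first touch.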
If I were starting a transactional SaaS (Software as a Service) vendor today, I might look at an architecture a lot like Workday’s. In particular:
- Having everything start in a Java object model makes loads of sense. (This is a use case for which Java seems far from obsolete.)
- The dual data model (object and tabular) seems very appealing. Rather than shoehorn data into an uncomfortable model, deal instead with the discomfort of running a couple of (logical) data stores side by side.
- Having multiple server tiers is pretty much a best practice.
However:
- Workday’s extreme wheel-reinvention in the area of database management might not make sense for smaller companies.
- In any given case, the exact choice of tiers might be different from Workday’s. In particular, there might need to be more explicitly analytics-oriented tiers than Workday chooses to split out.