Sunday, August 8, 2010

Cloud Integration Architecture

OK - So I've arrived at my one-year anniversary with SpringSource. And what a year it has been ... I have drunk my fill from the firehose and am now prepared to start blogging again about what I have learned about enterprise integration, especially as it relates to private cloud deployment. I must clarify right up front that the views within this blog are not necessarily the views of SpringSource or VMware, and that my views are likely biased by the fact that I work for this company. In this blog, I will focus on why the needs of enterprise integration in the cloud are radically different from those of traditional physical deployments.

Traditional enterprise integration deployments on physical hardware are based on the "centralized server" model. This deployment model all started with the data - big Relational Database Management Systems (RDBMS) for centralized management thereof. We then extended that centralized server concept to the application tier with big JEE application servers for centralized management of the applications that act on the data. Finally, we created the big Business Process Management Server / Enterprise Service Bus - centralized servers for managing the enterprise integration tier.

The cloud (whether it be private, public, or hybrid) allows for massively parallel, horizontal scale-out architectures. This is radically different from the centralized, vertical scale-up architectures that have evolved from the original success of big-RDBMS on physical hardware. So it has always been, and will always be ... the new cloud integration deployment model must start with the data.

There are two data persistence requirements that are fairly unique to integration architectures:
  1. High-Write - The primary reason for having integration as a separate tier is to avoid the loss of any in-flight messages. Ensuring this as a message is validated, transformed, enriched, and routed within the integration tier means a lot of writes to an underlying persistence store. Since messages are typically passed by value in most integration architectures today, it also means very few reads.
  2. Transient Data - The actual message data is really only of interest to the integration tier while a message is in-flight. After the message processing is complete, message data is no longer of use to the integration application. Sure - integration solutions do typically provide tracking of messages for historical purposes - but auditing is a tangential concern to the integration tier.
High-write applications do not scale well horizontally on today's RDBMS solutions. This is because most database servers on the market today scale-out primarily to handle highly concurrent reads of data. Writes of data are still serialized through a single master instance that can only be scaled up to handle higher load.

The transient data requirement makes you wonder whether a traditional database is the right solution for an integration tier at all - since traditional databases are meant for long-term, static storage of data. Certainly, an RDBMS makes sense for auditing of historical processing within the integration tier, but not necessarily for real-time online transaction processing of in-flight message data.

So - what other persistence store can handle high writes of transient data and scale out effectively to take better advantage of a cloud deployment architecture? The answer - an in-memory distributed data cache (a data fabric). It is with this argument that I firmly believe highly-distributed cloud enterprise integration solutions must be based on a data fabric capable of high-write transactions across multiple data partitions, high availability, and configurable synchronous / asynchronous persistence to disk. Persistence to an underlying RDBMS for long-term historical auditing purposes can be done through an asynchronous write-behind mechanism that clears completed transactions from the cache on a scheduled basis.
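To make the write-behind idea concrete, here is a minimal, single-process sketch in plain Java. It is not a real data fabric - there is no partitioning, replication, or actual RDBMS here; the class names and the in-memory audit list are illustrative stand-ins - but it shows the shape of the pattern: in-flight message state lives in memory (high-write, low-read), and completed messages are flushed to long-term audit storage and evicted from the cache.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Toy sketch of a write-behind message store. A real data fabric would
// partition and replicate the in-flight map across nodes, and the audit
// "log" would be an asynchronous batch insert into an RDBMS.
class MessageStore {
    private final Map<String, String> inFlight = new ConcurrentHashMap<>();
    private final Set<String> completed = ConcurrentHashMap.newKeySet();
    private final List<String> auditLog = new ArrayList<>(); // stands in for the RDBMS

    // High-write path: every validate/transform/enrich/route step
    // records the latest message state in memory only.
    void record(String id, String payload) {
        inFlight.put(id, payload);
    }

    // Called when the integration flow finishes with a message.
    void markComplete(String id) {
        completed.add(id);
    }

    // Write-behind flush: move completed messages to the audit store and
    // clear them from the cache. In a real deployment this would run on a
    // schedule (e.g. a ScheduledExecutorService), not inline.
    synchronized void flushCompleted() {
        for (String id : completed) {
            String payload = inFlight.remove(id);
            if (payload != null) {
                auditLog.add(id + ":" + payload);
            }
        }
        completed.clear();
    }

    boolean isInFlight(String id) { return inFlight.containsKey(id); }
    List<String> audit() { return auditLog; }
}
```

The key property is that the hot path (`record`) touches only memory, while the slow durable write happens off the critical path and doubles as cache eviction - exactly the division of labor argued for above.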

Moving up the stack from the data tier into the integration tier, we now must take a close look at traditional Message Oriented Middleware (MOM) solutions. Those of you who have an "I love M.O.M." tattoo on your arms should probably not read further. M.O.M. evolved not from Eve, but from the traditional Enterprise Application Integration (EAI) solutions that were a fad back in the late 90's. These EAI solutions were, as you would expect, highly centralized hub-and-spoke architectural approaches to enterprise integration. After Y2K, the ISV industry dusted them off, re-labeled them, and began to sell them to you for twice the price.

Enterprise Service Bus and Business Process Management middleware are server-based approaches to middleware built on JMS. JMS, like JEE, is a specification, and there is a "J" at the beginning of it for a reason. Vendors build server support for that specification and then compete on features that go beyond the specification to lock you in. Eventually, 2 or 3 competitors get the same new features into their servers, so they come together and agree on a specification ... and the cycle renews itself.

AMQP is an open internet protocol - designed to be asynchronous (unlike IIOP) and reliable (unlike SMTP). Open internet protocols have proven to outlast the lifetime of the average software company (e.g., HTTP). AMQP is an open standard, with client support in Java, .NET (WCF), Python, C, Perl, and Ruby, among others.

If you've been building your enterprise integration solutions on Spring Integration, as I've blogged about in the past - then you are in a great place ... as Spring Integration gives your application a portable abstraction over the transport layer called the channel. Making the switch from JMS to AMQP is simply a configuration change to the channel - with no effect on your application. Also, if you've been moving away from ESB/BPM server-centric architectures towards a highly-distributed event-driven architecture, as I've blogged about in the past, then you will be able to truly take advantage of the horizontal scale that ubiquitous messaging with AMQP gives you. Think of it as "twitter-style" application-to-application messaging for the business.
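The channel abstraction is easy to sketch in plain Java. To be clear, this is not the actual Spring Integration API - the interface and class names below are hypothetical simplifications - but it illustrates why swapping JMS for AMQP does not touch application code: the application depends only on the channel, and the transport is chosen where the channel is constructed (i.e., in configuration).

```java
import java.util.function.Consumer;

// Hypothetical, simplified analogue of a messaging channel abstraction.
// Application code depends only on this interface.
interface MessageChannel {
    void send(String message);
}

// One possible binding: an in-memory channel that delivers directly to a
// subscriber. A JMS- or AMQP-backed implementation would have the same
// shape, so switching transports is a change to wiring, not to callers.
class InMemoryChannel implements MessageChannel {
    private final Consumer<String> subscriber;

    InMemoryChannel(Consumer<String> subscriber) {
        this.subscriber = subscriber;
    }

    @Override
    public void send(String message) {
        subscriber.accept(message);
    }
}

// Application code: knows nothing about the transport behind the channel.
class OrderService {
    private final MessageChannel out;

    OrderService(MessageChannel out) {
        this.out = out;
    }

    void placeOrder(String id) {
        out.send("order-placed:" + id);
    }
}
```

In Spring Integration the equivalent wiring lives in configuration rather than constructor calls, which is what makes the JMS-to-AMQP switch a deployment decision instead of a code change.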

Moving further up the stack from the integration tier into the application tier, we now must take a close look at traditional JEE application servers. Those of you in application development who have been developing on the Spring Framework for years will already agree that you hardly make use of any of the runtime features of a full-stack JEE server. Spring released you from EJBs and gave you the portability you desired without the high cost against your creativity or productivity.

So why do we as an industry still hold on to the JEE application server when we know that our application developers don't really use it? The answer has to do with Reliability, Availability, Scalability, and Performance. The full-stack JEE application server gives Java developers RASP on physical hardware. RASP is not an easy thing to make simple, and folks who know how to tune a specific application server for RASP are hard to find, harder to recruit, and almost impossible to pay enough to keep for very long. Just like Oracle DBAs or folks who know the inner workings of WebSphere Message Broker, these folks command the highest salaries because they have dedicated their professional careers to learning all the buttons, levers, and switches that need to be set when tuning an application server for RASP.

The way to free yourself from all of this is to look for a different approach to RASP ... one that is consistent across all tiers of your application infrastructure, and one that is well known to the operations folks you already have running that infrastructure on a daily basis. The answer, of course, is virtualization. Virtualization is a proxy that provides RASP to application infrastructure through the very same Inversion of Control pattern that you've already come to love about the Spring Framework. Virtualization is capable of providing RASP to a database, a message broker, or an application server in a consistent and predictable manner that is well known to a large percentage of your operations staff. VMware's virtualization technology bases its approach to RASP on its VMotion capability - which allows virtual machines to be quickly moved from one physical host to another, whether due to an outage of the physical host or simply a spike in application load, without having to take those virtual machines down.

I hope I have convinced you that it is time to "re-think" the server. Cloud Integration Architecture requires tearing down the monolithic server-bound architectures we've spent the past 20 years building: database servers, middleware servers, JEE application servers. The cloud is now the server and solid application architecture again takes its rightful position as king. Re-commit yourself to the craft of software engineering and you will embrace the future of IT.

1 comment:

James Williams said...

You raise some interesting points about cloud and integration. I wonder how RASP specialists will function in this new cloud era? Does the need for app server, database and OS tuning experts evaporate or will the cloud make tuning and specialization even more important?