Sunday, August 7, 2011

Cloud Integration Architecture: The complementary roles of Data Distribution and Application Eventing

When discussing the importance of a distributed data fabric in modern application architecture, I often have to explain the new role of application messaging and how it complements data fabric within the context of Cloud Integration Architecture.

Message Oriented Middleware has largely been misused in the past as a workaround for distributing large amounts of data within the enterprise, due to the lack of partitioning support within many standard RDBMS offerings. This is why expensive and complex centralized distributed transaction coordination has sometimes found its way into enterprise application designs – to solve artificial problems that stove-piped relational database and message oriented middleware products have created. The end result of all this has been a higher degree of coupling at the application tier, not only between the application and its underlying infrastructure, but also between applications themselves – as most implementations of this type use point-to-point messaging in their designs. Message Bus and Business Process Management products evolved to loosen the coupling of these solutions, but still require applications to share data and operate in a unified manner in response to a set of common business requests. These solutions, like the relational databases they complement, are implemented as centralized servers requiring shared storage for high availability, limiting architects to vertical-only scalability models that are not optimized for cloud-style deployment.

A distributed data fabric, such as VMware vFabric GemFire, supports the partitioning and replication of big data by combining database and messaging semantics.

The data fabric supports ACID-style consistency and high availability through the automated management of redundant copies of data partitions across multiple local servers. Redundant local data copies are synchronized in parallel, so creating higher levels of availability for a distributed solution costs the application architect nothing in terms of latency. When a local server is lost, one of the redundant copies takes over as the new primary for its data, and redundancy SLAs are re-established across the fabric. This means that if the redundancy SLA is set to n copies, all n + 1 servers holding a given partition (the primary plus its n redundant copies) would have to be lost simultaneously before there is an availability issue within a data center. It also means the solution can easily scale horizontally within the data center by adding or removing servers from the local fabric dynamically to serve more (or fewer) application clients.
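To make the placement idea concrete, here is a small, self-contained sketch (hypothetical class and method names, not GemFire APIs) of how a fabric might choose a primary and its redundant copies for a key, and promote a survivor when the primary's server is lost:

```java
import java.util.*;

// Illustrative sketch only: placing a primary and N redundant copies of a
// partition across members, and recomputing the primary after a failure.
public class PartitionPlacement {

    // Pick ordered distinct members for a key: the first is the primary,
    // the rest hold the redundant copies.
    public static List<String> placements(String key, List<String> members, int redundantCopies) {
        List<String> ring = new ArrayList<>(members);
        // A deterministic rotation by the key's hash stands in for the
        // consistent hashing a real fabric would use.
        Collections.rotate(ring, -(Math.abs(key.hashCode()) % ring.size()));
        return ring.subList(0, Math.min(redundantCopies + 1, ring.size()));
    }

    // When the primary's member fails, a redundant copy is promoted and
    // placement is recomputed over the surviving members.
    public static String primaryAfterFailure(String key, List<String> members,
                                             int redundantCopies, String failed) {
        List<String> survivors = new ArrayList<>(members);
        survivors.remove(failed);
        return placements(key, survivors, redundantCopies).get(0);
    }

    public static void main(String[] args) {
        List<String> members = List.of("server1", "server2", "server3", "server4");
        List<String> copies = placements("order-42", members, 1); // primary + 1 redundant copy
        System.out.println("primary=" + copies.get(0) + " redundant=" + copies.subList(1, copies.size()));
        System.out.println("after failure, primary=" + primaryAfterFailure("order-42", members, 1, copies.get(0)));
    }
}
```

The rotation here is only a stand-in for consistent hashing; the point is that placement is deterministic and can be re-established from the surviving members after a failure.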

A data fabric also supports eventual consistency, and further high availability, through the automated management of redundant copies of data partitions across multiple data centers over a wide area network. Redundant distributed data copies are maintained asynchronously, allowing updates to be batched before being sent over the WAN - optimizing the use of this more expensive network resource. A distributed data fabric allows data to be globally consistent within fractions of a second to seconds, as opposed to the tens of minutes or hours typical of traditional log shipping solutions.

Each server within the distributed data fabric uses “shared nothing” parallel disk persistence to manage both its primary and redundant data. Reads are then served by all copies, while writes are served only by the primary. The built-in messaging queues underlying the WAN distribution mechanism of the data fabric are managed by the same redundancy SLA and backed by the same shared nothing parallel disk persistence. In this way, architects need neither shared storage nor distributed transactions to support the effective management of data underneath distributed applications optimized for cloud deployment.

So what does this all mean for application messaging?

The future of application messaging is founded in event driven architecture. Distributed application components publish events asynchronously to a message broker solution as they are processing data. Those same distributed application components can also voluntarily subscribe to the message broker solution in order to consume messages they are interested in for further processing.

Modern message brokers, such as VMware vFabric RabbitMQ, are designed to handle very high throughput, employing horizontal scalability and availability characteristics similar to their complementary data fabric solutions. All messages are published to exchanges (or topics), which are shared across multiple brokers. All messages are consumed from queues, which are local to a specific broker. New brokers can be added to or removed from the cluster to serve more or fewer application clients. Brokers are backed by persistence to local disk - eliminating the need for shared storage.
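The exchange/queue split described above can be sketched in a few lines (hypothetical names; a toy model, not the RabbitMQ client API): messages fan out from a shared exchange to every bound queue, and each consumer drains its own local queue:

```java
import java.util.*;

// Toy model of a fan-out exchange: publishing copies a message into every
// bound queue; consumption is local to one queue and leaves the others alone.
public class MiniExchange {
    private final Map<String, Deque<String>> queuesByBinding = new HashMap<>();

    public void bind(String queueName) {
        queuesByBinding.put(queueName, new ArrayDeque<>());
    }

    // Publish: copy the message into every bound queue.
    public void publish(String message) {
        for (Deque<String> q : queuesByBinding.values()) q.addLast(message);
    }

    // Consume: take the next message from one specific queue (null if empty).
    public String consume(String queueName) {
        return queuesByBinding.get(queueName).pollFirst();
    }

    public static void main(String[] args) {
        MiniExchange events = new MiniExchange();
        events.bind("billing");
        events.bind("audit");
        events.publish("order.created:42");
        System.out.println(events.consume("billing")); // each subscriber gets its own copy
        System.out.println(events.consume("audit"));
    }
}
```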

Since the distributed application components also share a distributed data fabric, the business events being shared at the application messaging tier don't need to contain all of the data in the model. In fact, modern application frameworks, such as Spring Integration, support the Claim Check pattern for this very reason. The Claim Check pattern allows an architect to persist a complex object model to a shared data store before the message is sent. The shared data store returns a claim check, or unique id, by which the data can be retrieved if/when needed. In this way, the message payload for the event need only contain the claim check for the data.
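A minimal sketch of the Claim Check pattern (the in-memory map below is a hypothetical stand-in for the shared data fabric, and the names are illustrative, not the Spring Integration API):

```java
import java.util.*;
import java.util.concurrent.ConcurrentHashMap;

// Claim Check sketch: check the full payload in to a shared store, send only
// the claim check in the message, and check the payload out when needed.
public class ClaimCheckStore {
    private final Map<String, Object> store = new ConcurrentHashMap<>();

    // Persist the payload and hand back a claim check (unique id).
    public String checkIn(Object payload) {
        String claimCheck = UUID.randomUUID().toString();
        store.put(claimCheck, payload);
        return claimCheck;
    }

    // Redeem the claim check to retrieve the original payload.
    public Object checkOut(String claimCheck) {
        return store.get(claimCheck);
    }

    public static void main(String[] args) {
        ClaimCheckStore fabric = new ClaimCheckStore();
        Map<String, String> order = Map.of("id", "42", "status", "NEW");
        String event = "OrderCreated:" + fabric.checkIn(order); // the event carries only the id
        String claim = event.substring("OrderCreated:".length());
        System.out.println(fabric.checkOut(claim)); // consumer redeems the claim check
    }
}
```

The event payload stays tiny no matter how large the object model is; only consumers that actually need the data pay the cost of retrieving it.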

With a distributed data fabric underneath the application tier, architects are now free to use application eventing ubiquitously within an application architecture. No longer must we obsess over the proper level of granularity across our distributed application components, because modern application frameworks, such as Spring Integration, support an abstract concept of the channel used to communicate between those application components. It is only a matter of external configuration to change my application components from collaborating locally within a single process to communicating remotely across multiple distributed processes - nothing in my application code itself is aware of this change.

Looking ahead to the not-so-distant future, it will be possible for cloud application platforms to manage the distribution of an application in real time, in direct response to load. Under low-load conditions an application may be configured to run entirely within one process. Cloud application Platform as a Service (aPaaS) solutions, such as VMware Cloud Foundry, can already dynamically scale individual processes in response to real-time load characteristics. With modern application frameworks supporting the Control Bus pattern going forward, aPaaS solutions will also be able to automatically distribute applications across multiple processes and scale those processes independently of each other.

Sunday, August 8, 2010

Cloud Integration Architecture

OK - So I've arrived at my one-year anniversary with SpringSource. And what a year it has been ... I have drunk my fill from the firehose and am now prepared to start blogging again about what I have learned about enterprise integration, especially as it relates to private cloud deployment. I must clarify right up front that the views within this blog are not necessarily the views of SpringSource or VMware, but that my views are likely biased by the fact that I work for this company. In this blog, I will focus on why the needs of enterprise integration in the cloud are radically different from traditional physical deployments.

In the physical world, traditional enterprise integration deployments are based on the "centralized server" model. This deployment model all started with the data - and big Relational Database Management Systems (RDBMS) for centralized management thereof. We then extended that centralized server concept to the application tier with big JEE application servers for centralized management of the applications that act on the data. Finally, we created the concept of the big Business Process Management Server / Enterprise Service Bus - centralized servers for managing the enterprise integration tier.

The cloud (whether it be private, public, or hybrid) allows for massively parallel, horizontal scale-out architectures. This is radically different from the centralized, vertical scale-up architectures that have evolved from the original success of big-RDBMS on physical hardware. So it has always been, and will always be ... the new cloud integration deployment model must start with the data.

There are two data persistence requirements that are fairly unique to integration architectures:
  1. High-Write - The primary reasoning for having integration as a separate tier is to avoid the loss of any in-flight messages. Ensuring this as a message is validated, transformed, enriched, and routed within the integration tier means a lot of writes to an underlying persistence store. Since messages are typically passed by value in most integration architectures today, this also means very few reads.
  2. Transient Data - The actual message data is really only of interest to the integration tier while a message is in-flight. After the message processing is complete, message data is no longer of use to the integration application. Sure - integration solutions do typically provide tracking of messages for historical purposes - but auditing is a tangential concern to the integration tier.
High-write applications do not scale well horizontally on today's RDBMS solutions. This is because most database servers on the market today scale-out primarily to handle highly concurrent reads of data. Writes of data are still serialized through a single master instance that can only be scaled up to handle higher load.

The transient data requirement makes you wonder whether a traditional database is even the right solution for an integration tier - since traditional databases are meant for long-term, static storage of data. Certainly, an RDBMS makes sense for auditing of historical processing within the integration tier, but not necessarily for real-time online transaction processing of in-flight message data.

So - what other persistence store can handle the high writes of transient data and scale out effectively to take better advantage of cloud deployment architecture? The answer - an in-memory distributed data cache (a data fabric). It is with this argument that I firmly believe highly-distributed cloud enterprise integration solutions must be based on a data fabric capable of high-write transactions within multiple data partitions, providing high availability, and offering configurable synchronous / asynchronous persistence to disk. Persistence to an underlying RDBMS for long-term historical auditing purposes can be done through an asynchronous write-behind mechanism that clears those completed transactions from the cache on a scheduled basis.
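The write-behind idea can be sketched as follows (hypothetical names; a real fabric would run the flush on a schedule and write batches to an actual RDBMS rather than to an in-memory audit list):

```java
import java.util.*;
import java.util.concurrent.*;

// Sketch of asynchronous write-behind: in-flight message state lives in an
// in-memory map (the "fabric"); completed entries are drained in batches to
// a slower audit store (the "RDBMS") and cleared from the cache.
public class WriteBehindCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Queue<String> completed = new ConcurrentLinkedQueue<>();
    private final List<String> auditStore = Collections.synchronizedList(new ArrayList<>());

    // High-write hot path: record in-flight message state in memory only.
    public void put(String messageId, String state) { cache.put(messageId, state); }

    // Mark a transaction as complete, queuing it for the next flush.
    public void markComplete(String messageId) { completed.add(messageId); }

    // Scheduled flush: batch completed entries to the audit store and clear
    // them from the cache. Returns the number of entries flushed.
    public int flush() {
        int flushed = 0;
        String id;
        while ((id = completed.poll()) != null) {
            String state = cache.remove(id);
            if (state != null) {
                auditStore.add(id + "=" + state);
                flushed++;
            }
        }
        return flushed;
    }

    public int inFlightCount() { return cache.size(); }

    public static void main(String[] args) {
        WriteBehindCache fabric = new WriteBehindCache();
        fabric.put("msg-1", "ROUTED");
        fabric.put("msg-2", "TRANSFORMED");
        fabric.markComplete("msg-1");
        System.out.println("flushed " + fabric.flush() + ", in flight " + fabric.inFlightCount());
    }
}
```

Note that the expensive store never sits on the message-processing hot path: writes hit memory, and the audit trail catches up asynchronously.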

Moving up the stack from the data tier into the integration tier, we now must take a close look at traditional Message Oriented Middleware (MOM) solutions. Those of you who have an "I love M.O.M." tattoo on your arms should probably not read further. M.O.M. evolved not from Eve, but from traditional Enterprise Application Integration (EAI) solutions that were a fad back in the late 90's. These EAI solutions were, as you would expect, highly centralized hub-and-spoke architectural approaches to enterprise integration. After Y2K, the ISV industry dusted these off, re-labeled them, and began to sell them to you for twice the price.

Enterprise Service Bus and Business Process Management middleware are server-based approaches to middleware based on JMS. JMS, like JEE, is a specification, and there is a "J" at the beginning of it for a reason. Vendors build server support for that specification and then compete on features that go beyond the specification to lock you in. Eventually, two or three competitors get the same new features into their servers, so they come together and agree on a specification ... and the cycle renews itself.

AMQP is an open internet protocol - designed to be asynchronous (unlike IIOP) and reliable (unlike SMTP). Open internet protocols are proven to outlast the lifetime of the average software company (i.e., HTTP). AMQP is an open standard, natively supported by Java, .NET (WCF), Python, C, Perl, and Ruby among others.

If you've been building your enterprise integration solutions on Spring Integration, as I've blogged about in the past - then you are in a great place ... as Spring Integration gives your application a portable abstraction over the transport layer called the channel. Making the switch from JMS to AMQP is simply a configuration change to the channel - with no effect on your application. Also, if you've been moving away from ESB/BPM server-centric architectures toward a highly-distributed event-driven architecture, as I've blogged about in the past, then you will be able to truly take advantage of the horizontal scale that ubiquitous messaging with AMQP gives you. Think of it as "twitter-style" application-to-application messaging for the business.
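As a sketch of what that configuration-only switch looks like (the element names below are from the Spring Integration core and JMS namespaces; treat the attribute details as approximate rather than drop-in configuration):

```xml
<!-- Local collaboration: a direct, in-process channel. -->
<int:channel id="orders"/>

<!-- Distributed collaboration: the same logical channel, now backed by a
     message broker. The producing and consuming components that reference
     "orders" are untouched; only this configuration differs. -->
<int-jms:channel id="orders" queue="ordersQueue"/>
```

The equivalent substitution applies for an AMQP-backed channel once the AMQP adapters are in place - the application code referencing the channel never changes.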

Moving further up the stack from the integration tier into the application tier, we now must take a close look at traditional JEE application servers. Those of you in application development who have been developing on the Spring Framework for years will already agree that you hardly make use of any of the runtime features of a full-stack JEE server. Spring released you from EJBs and gave you the portability you desired without the high cost against your creativity or productivity.

So why do we as an industry still hold on to the JEE application server when we know that our application developers don't really use it? The answer has to do with Reliability, Availability, Scalability, and Performance. The full-stack JEE application server gives Java developers RASP on physical hardware. RASP is not an easy thing to make simple, and folks who know how to tune a specific application server for RASP are hard to find, harder to recruit, and almost impossible to pay enough to keep for very long. Just like Oracle DBAs or folks who know the inner workings of WebSphere Message Broker, these folks command the highest salaries because they have dedicated their professional careers to learning all the buttons, levers, and switches that need to be set when tuning an application server for RASP.

The way to free yourself from all of this is to look for a different approach to RASP ... one that is consistent across all tiers of your application infrastructure, and one that is well known across many of the operations folks you already have running that infrastructure on a daily basis. The answer, of course, is virtualization. Virtualization is a proxy that provides RASP to application infrastructure through the very same Inversion of Control pattern that you've already come to love about the Spring Framework. Virtualization is capable of providing RASP to a database, a message broker, or an application server in a consistent and predictable manner that is well-known to a large percentage of your operations staff. VMware's virtualization technology bases its approach to RASP on its VMotion capability - which allows virtual machines to be quickly moved from one physical host to another, whether due to an outage of the physical host or just spikes in application load, without having to take those virtual machines down.

I hope I have convinced you that it is time to "re-think" the server. Cloud Integration Architecture requires tearing down the monolithic server-bound architectures we've spent the past 20 years building: database servers, middleware servers, JEE application servers. The cloud is now the server and solid application architecture again takes its rightful position as king. Re-commit yourself to the craft of software engineering and you will embrace the future of IT.

Saturday, June 20, 2009

Agile SOA - Part III

This is the third and final blog in this series. In Part I, I opened with the thought that the real challenge of SOA is changing the way you look at the IT investment of your business to be more from the perspective of software architecture. In Part II, I discussed that the best approach to software architecture is Agile, due to the evolutionary nature of software. In this final installment I'd like to close with how an Agile approach to software architecture - based on thinking of SOA in terms of composite design patterns implemented as part of your application architecture through the use of embedded frameworks - will help you better realize the benefit of an Agile approach.

In the quest for enterprise architecture discipline, many of us are re-creating our futures out of our past by modeling it through a static, black-box approach. If your business' enterprise architecture can only be described in a PowerPoint slide deck, using boxes and connecting lines - then you haven't taken the time to truly understand the collective learning from the previous generation of hardware-centric architects - and are doomed to make the same mistakes they made. SOA makes behavior a first-class design concern, and pushes static representations of architecture to a "behind-the-scenes" supportive role.

A business' enterprise architecture is a living organism, much like the business it supports, and at its core should be a software architecture - modeled through a dynamic, white-box approach. We should be striving to describe this software architecture through a means closer to a linear, episodic series of video clips. Each episode should demonstrate an encapsulated portion of the system in a white-box manner - demonstrating the scripted pattern in which these components interact (think of a football play being viewed from an overhead camera with primary and optional flows for the different players on the field). Episodes should be linked together in a linear fashion through the use of rule-based pathing (think of a football game as a composition of individual plays - where the individual plays are woven together with an over-arching strategy that capitalizes on both an individual team's strengths and an opposing team's weaknesses). It is in this way that a business strategy can best be decomposed into a set of composite processes that capitalize on the competitive advantage of a business within an industry full of competitors.

Many of you who are reading this may think that I am just another advocate of a Business Process Management (BPM) middleware-centric approach to SOA. In fact, I am not. Using my analogies above, I believe the Process Manager pattern is a good fit for demonstrating the scripted interaction of local components. I emphasize the word local here because I do not believe Process Manager is a good architectural pattern for interactions of distributed components. Hohpe and Woolf describe the Process Manager pattern as follows in their Enterprise Integration Patterns book: "Using a Process Manager results in a so-called hub and spoke pattern of message flow." Similar to database architecture, the hub and spoke architecture of a traditional process manager will not scale well horizontally. Horizontal scalability is key to distributed computing, especially within today's cloud hardware architectures.

If you still think that the Process Manager pattern is best for managing the scripted behavior of local components, then you have to take a long, hard look at the BPM middleware at the core of most commercial-off-the-shelf (COTS) vendor SOA suites. BPM middleware today is built around the Web Services Business Process Execution Language (WS-BPEL) standard - which describes a standard way in which many, distributed Web Services can be orchestrated within the context of a single business process. If the Process Manager pattern should not be used for distributed components, then the link between Web Services and today's BPM middleware should come into question. If the Web Services standard is removed from BPM middleware, then the whole WS-BPEL standard fails as a foundation for implementing the Process Manager pattern, and COTS BPM products in turn fail as a foundational component of your SOA infrastructure.

As an alternative to BPM middleware, I prefer a lightweight solution, such as the one described in the following InfoQ article by my colleague and friend, Oleg Zhurakousky: Workflow Orchestration Using Spring AOP and AspectJ. This article demonstrates how to build and orchestrate highly configurable and extensible, yet lightweight embedded process flows within your software architecture using Aspect Oriented Programming (AOP) techniques. Oleg's use of Spring AOP, a proxy-based AOP mechanism, to address functional concerns such as instrumenting a process with activities, while using AspectJ, a byte-code weaving AOP mechanism, to address non-functional concerns such as activity navigation and transition is, in my opinion, the perfect way to implement the scripted interaction of local business logic components within the Process Manager pattern.

So if the Process Manager pattern is not the right foundational pattern for your distributed SOA, then what is? I believe the right foundational pattern for your SOA is the Staged Event Driven Architecture (SEDA) pattern first published by Matt Welsh, et al., from the Computer Science Division of the University of California, Berkeley, in 2001. In SEDA, applications consist of a network of event-driven stages connected by explicit queues. Using my previous football analogy, think of the network as the whole football game, composing a number of event-driven stages which represent the individual plays. The network architecture underlying SEDA is intrinsically better at supporting the horizontal scalability demanded by distributed SOA computing.

Within the context of SEDA, a stage is a self-contained application component consisting of an event handler, an incoming event queue, and a thread pool. As described above, that self-contained application component could itself be a composite scripted interaction of local application components. The introduction of a queue between stages decouples the execution of application components by introducing an explicit control boundary - which allows for their distribution. The SEDA model constrains the execution of a thread to a given stage, as a thread may only pass data across the control boundary by enqueuing an event. Introducing a queue between two modules of an application provides isolation, modularity, and independent load management. Because stages interact through an event-dispatch protocol instead of a traditional API, it is straightforward to interpose proxy stages between components for rules-based pathing, performance profiling, and/or testing & debugging.
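The stage just described - an event handler, an incoming event queue, and a thread pool - can be sketched directly (hypothetical and simplified; real SEDA implementations also resize thread pools and apply admission control per stage):

```java
import java.util.concurrent.*;
import java.util.function.Function;

// Minimal SEDA stage: an incoming event queue, a thread pool, and an event
// handler. A thread never crosses the stage boundary; it hands work to the
// next stage only by enqueuing an event.
public class Stage<I, O> {
    private final BlockingQueue<I> inbox = new LinkedBlockingQueue<>();
    private final ExecutorService pool;

    public Stage(int threads, Function<I, O> handler, Stage<O, ?> next) {
        pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.submit(() -> {
                try {
                    while (true) {
                        O out = handler.apply(inbox.take());  // handle one event
                        if (next != null) next.enqueue(out);  // cross the boundary by enqueuing
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();       // shut down cleanly
                }
            });
        }
    }

    public void enqueue(I event) { inbox.add(event); }

    public void shutdown() { pool.shutdownNow(); }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> results = new LinkedBlockingQueue<>();
        // A terminal "sink" stage collects results; an "enrich" stage feeds it.
        Stage<String, String> sink = new Stage<>(1, m -> { results.add(m); return m; }, null);
        Stage<String, String> enrich = new Stage<>(2, m -> m + ":enriched", sink);
        enrich.enqueue("msg-1");
        System.out.println(results.take());
        enrich.shutdown();
        sink.shutdown();
    }
}
```

Because stages interact only through queues, each stage's thread pool can be sized (or its process relocated) independently - which is exactly the isolation and independent load management claimed above.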

One of the best implementations of the SEDA pattern is Spring Integration, described in the following InfoQ article by Joshua Long: Getting Started with Spring Integration. In an answer to the "Why Spring Integration?" question, Joshua says it best with the following quote: "Because Spring Integration is so lightweight (you deploy the Spring Integration server with your application; you don't deploy your application to Spring Integration) and so focused on the development life cycle (XML schemas to facilitate configuration, POJO-friendly APIs, and strong integration with the Spring Framework and JEE), you'll find Spring Integration a better fit than a lot of other ESBs." Spring Integration supports a pluggable transport layer, where a lightweight, highly scalable Message Broker, such as Apache ActiveMQ, can provide support for the network architecture that allows for the horizontal scalability of the Spring Integration solution.

So why all of these tools from the SpringSource ecosystem (Spring Framework, Spring AOP, AspectJ, Spring Integration, and ActiveMQ)? Traditional Enterprise Integration products promote a proprietary development and deployment model that requires a steep, costly organizational learning curve to successfully adopt. In addition, the more successful you are at adopting these development tools and deployment models - the more locked in to those proprietary products you become. The SpringSource mantra is Eliminating Enterprise Java Complexity. Using their standard Java development tools and deployment models, SOA can be incrementally adopted in a lower-risk, more agile way - led by the Java developers and systems analysts you already have, using the tools they already know (and love). The end result of this incremental adoption is simply a refactored version of your current Java and .NET business applications. Finally, the Spring Framework, which serves as the foundation for all of these tools, ensures both the tight integration and full portability you've come to expect from any SpringSource solution.

As I stated in the very beginning of this series, SOA is an architectural pattern - not an expensive suite of software products. A pattern, by definition, is the encapsulation of a complex, dynamic system into a reusable component. Patterns are meant to describe, through a white-box approach, best practice ways for you to build YOUR solution - they are not meant to be solutions in and of themselves. Patterns therefore lend themselves to be best implemented by lightweight, embeddable frameworks that serve to support your solution, not heavy, commercial-off-the-shelf products that aim to control it. The integrated Spring components, backed by the unwavering commitment to simplicity of the SpringSource company, provide you with the Agile SOA solution you need to support the evolutionary nature of your business.

Thursday, December 18, 2008

Agile SOA - Part II

After establishing the forces behind the paradigm shift of Service Oriented Architecture (SOA) in Part I of this series, I'd like to talk about how we should now be approaching enterprise architecture from the perspective of software architecture in this article. SOA is an architectural pattern for how to build software that better supports the evolutionary characteristics of your business. The emphasis within this discussion will be around the term "evolutionary", and how agile methods best support the evolutionary development of software.

We'll first start by contrasting the "evolutionary" aspect of modern software architecture against the "predictable" aspect of modern hardware architecture. Moore's law best describes the predictability of modern hardware architecture: that almost every aspect of digital electronic devices (processing speed, memory capacity, etc.) improves exponentially, doubling approximately every two years. The trend, first observed by Intel co-founder Gordon E. Moore in a 1965 paper, has continued for almost half a century and is not expected to stop anytime soon.

Modern software architecture has no such predictability. It is best described by the theory of evolution, well-known within the domain of biology. The primary reason for this is that software is used to model the knowledge and processes that form the foundation of a business. This intellectual property (IP) that gives a business its competitive differentiation comes from the coordinated effort of its people. In biology, evolution is change in the inherited traits of a population of organisms from one generation to the next. Evolution occurs naturally within any biological population, so it is natural to expect that the knowledge and processes guarded by a group of people will evolve as well, and therefore that the software that models that IP must have the chief characteristic of "evolvability" in its design.

Whether you fall on the side of Evolution or Creation as it applies to the origins of humanity, no one can deny the fact that software is created. One could argue whether all software was indeed created by an "intelligent designer", but that is the subject of another blog. Designing software to best support its own evolution is by far the hardest thing a software architect must attempt to do. While evolution is a relatively simple thing to comprehend, it is not even remotely simple to design for.

Service Oriented Architecture is the next step along the path our industry has taken since its inception to achieve "evolvability" in software design. At first we categorized and organized data "in-situ" with structured programming, then we began to add behavior and encapsulate both data and its behavior as objects with object-oriented design, and now we are making behavior a first-class design concern with SOA - pushing data to more of a "behind-the-scenes" supportive role. This is because the knowledge that forms the competitive advantage of a business is derived from the interpretation of data created and managed through the processes with which the business provides value to its customers. Data is only a means to an end; it is in the successful interpretation of data that a business lives or dies. That successful interpretation is inextricably linked to the evolving environment in which the business is competing. Safely and securely nurturing and advancing the IP of a business is indeed akin to "survival of the fittest" within the free market economy.

So why is Agile the best method to use when approaching the evolutionary characteristic that is the chief concern of SOA? Simple: because while those of us who architect software may be intelligent, we are not omniscient, and cannot predict the ways in which our software and its use within the business context will evolve over time. Whenever I discuss Agile, I like to go back to the four basic values that were recognized by Kent Beck in "Extreme Programming Explained: Embrace Change", Addison-Wesley, 1999. These values are Communication, Simplicity, Feedback, and Courage.

Communication. "The first value is communication. Problems with projects can invariably be traced back to somebody not talking to somebody else about something important." 

An Agile approach to SOA is all about establishing, and continually improving, the communication between business and IT. Architecture is the way in which those of us involved with software design communicate to our business counterparts. This communication is meant to be "reflective", in that it should reflect the values, priorities, and requirements initially shared with us by the business. Perhaps the biggest way in which SOA improves the "reflective" property of IT-to-business communication around software design is that it decouples the two concepts of business and application services. Business services are discovered in a top-down fashion from the activities that make up a business process. A business service delegates the automated work related to the activity it is modeling to application services. Application services are discovered in a bottom-up fashion from the existing systems and data that support the business. It is in the mapping of business services to application services that our business counterparts can truly understand the value provided by their application portfolio, and that we in IT can rationalize the continuing investment from the business for maintaining it.

Simplicity. "The second value is simplicity. The [Agile practitioner] continually asks the question: What is the simplest thing that could possibly work?"

An Agile approach to SOA is all about deriving simplicity from the seemingly complex. Legacy application services are complex, mainly because those who designed them just didn't expect the internals of these applications ever to see the light of day in the business world. Legacy Enterprise Resource Planning (ERP) applications were a commodity, and so long as they functioned properly as a whole within acceptable service levels, it didn't really matter how they were designed internally. However, in SOA, these legacy application monoliths must be broken down into the specific services they provide in order to effectively trace those application services to the business activities they support.

Simplicity in SOA comes down to reducing the "surface area" of the software we are integrating into our business. Legacy monolithic applications just have too much surface area to consider when attempting to integrate them within an evolving business process. The ERP days were also all about implementing these legacy monoliths "vanilla", with as few customizations as possible - because customizations were hard and cost money. This led to the business being forced to conduct their processes the way the software dictated. This rigidity, introduced by IT, led to many businesses falling behind in their ability to compete by successfully adapting to their continually evolving industries. This has left a very bad taste in their mouths about IT (not to mention the debacle of Y2K, which yielded absolutely no business value).

With SOA, it is an absolute imperative that we find the simplest way to keep our respective businesses competitive. SOA promotes simplicity in its design through modularity and flexibility. By flexibly composing more complex business services from simple, modular application service building blocks - we are better preparing that software to evolve along with the business processes it supports. Kent Beck says it best: "[Agile software development] is making a bet. It is betting that it is better to do a simple thing today and pay a little more tomorrow to change it if it needs it, than to do a more complicated thing today that may never be used anyway." Time is the enemy of complexity - only the simplest of solutions will endure.

Feedback. "The third value [of Agile] is feedback. Concrete feedback about the current state of the system is absolutely priceless. Optimism is an occupational hazard of [software architecture]. Feedback is the treatment."

An Agile approach to SOA is all about promoting feedback at both design-time and run-time, creating a continual feedback loop between the changes in the processes that support the business and the responsiveness of IT to those changes. At design-time it does this by improving the testability of software while at run-time it does this by allowing real business results to be tracked and monitored in real time. 

In software, increased modularity leads to increased testability. The testability of software concerns how readily it exposes errors during testing. More complex software is more difficult to test because its complexity acts as a sort of camouflage for latent bugs, allowing them to survive longer without detection. Many studies have shown that the longer it takes to detect a bug, the more costly it is to correct. Simple, modular software architecture allows for more effective testing of smaller, and therefore more manageable, chunks of an application. Since this testing can be focused on the module and performed independently of the rest of the application, it is far less likely to conceal bugs further into the software lifecycle.
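As a tiny illustration of that point, a simple, dependency-free module can be tested in complete isolation; the function below is a made-up example, not taken from any real system:

```python
# Sketch: a small, independently testable module. Because the function has
# no dependency on the rest of an application, a bug in it has nowhere to
# hide during testing. The example is purely illustrative.

def apply_discount(price: float, percent: float) -> float:
    """Return price reduced by percent, rejecting out-of-range input."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)

# A focused test exercises the module in isolation, with no setup of the
# surrounding application required:
assert apply_discount(100.0, 10) == 90.0
assert apply_discount(19.99, 0) == 19.99
```

Contrast this with a monolith, where reaching the same logic might require standing up a database, a user session, and half a dozen unrelated subsystems before the first assertion can run.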

SOA also promotes feedback at run-time by linking the work performed by application services to tangible business results. Business services represent activities that add value for the customer within the context of a business process. Run-time monitoring of these business services provides real-time feedback on how well the business is running. If a business service is not running within established service levels, it is also easy to "click into" a particular business service and see if the reliability, availability, scalability or performance of the supportive application services are the source of the problem. Finally, the real-time metrics collected by monitoring a business process can be used to simulate business process performance under load to better understand the impact of changes to that business process while those changes are still under test.
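A rough sketch of that "click into" drill-down follows; the service names, latencies, and SLA threshold are all hypothetical, and none of this comes from a real monitoring product:

```python
# Sketch: tracing a business-service SLA breach to the application services
# that support it. All names and numbers are illustrative only.

BUSINESS_SLA_MS = 500  # hypothetical response-time SLA for the business service

app_service_latency_ms = {  # latest observed latencies (made-up data)
    "credit-check": 120,
    "order-history": 450,
    "tax-calculation": 80,
}

def sla_breached(latencies: dict, sla_ms: int) -> list:
    """If the combined latency of the supporting application services
    violates the business service's SLA, return those services sorted
    worst-first so the likely source of the problem is visible; otherwise
    return an empty list."""
    if sum(latencies.values()) <= sla_ms:
        return []
    return sorted(latencies, key=latencies.get, reverse=True)
```

With data like the above, the drill-down immediately points at `order-history` as the dominant contributor, which is exactly the kind of real-time feedback the paragraph describes.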

Courage. "Within the context of the first three values - communication, simplicity, and feedback - it's time to go like hell. If you aren't going at absolutely your top speed, somebody else will be, and they will eat your lunch."

An Agile approach to SOA is all about finding the courage to be wrong sometimes, while minimizing the impact of being wrong and optimizing the time to correct those mistakes as they are discovered. Due to the predictable nature of hardware architecture, it wasn't generally acceptable for the first generation of enterprise architects to be wrong. The new generation of architects must be capable of making educated decisions in the trade-off between "perfect" and "fast". If software is going to successfully evolve at the speed of business, we software architects must learn how to facilitate, not impede, that evolutionary progress. We must trust that if we follow the other tenets of the Agile approach, our mistakes will be fewer, of smaller scope, and much easier to detect and correct. The business does not really care if you can unequivocally point the finger at a software vendor as the source of a problem - they only care how long it will take you to diagnose and correct that problem. If you are able to keep pace with the business, then small, quickly recoverable problems with the software will be tolerated by the business. It is only when you cannot keep pace with the business that the business begins to expect the impossible. Your customer's expectations for the software you deliver are directly proportional to the time it takes for you to deliver it -- the longer it takes for you to deliver software to your customer, the higher your customer's expectations for that software will be. SOA gives you the modular and flexible design you need to embrace business change, while an Agile approach gives you the most rapid approach to keep pace with that business change.

So if I've convinced you that an Agile approach to SOA is the right approach, you may come to see the monolithic SOA Suites the large vendors sell as overly complex, and the cumbersome development methods and processes they dictate to you as overly slow. This will be the subject of my next blog - part III in this series.

Saturday, October 4, 2008

Agile SOA - Part I

Service Oriented Architecture is not the expensive suite of software products the large software vendors would like to sell you. SOA is an architectural pattern. A pattern is a well-described, proven way of doing something that has evolved out of multiple attempts (successes and failures) across an entire community of professionals. SOA is a best-practice way of constructing an enterprise software architecture to better suit the evolving needs of your business.

The real challenge of adopting SOA is to change the way you and your organization think about enterprise architecture, not to change your information technology infrastructure yet again to continue the Cold War-like arms race against your competitors that the software vendors have sold you on (at a great profit to them, I might add).

Why is SOA such a challenge? Simple: because it puts the focus of enterprise architecture on software architecture, not hardware and network architecture. The IT industry is in the midst of a handoff between its first generation of infrastructure architects and a new generation of software architects. Businesses, who have come to rely heavily on their architects, are struggling to understand a new software-centric view of their technology portfolio. The old days were (somewhat) easy: let the IT guys handle the infrastructure, and we'll handle the business. Investments in technology were relatively simple and straightforward to both understand and manage - and the results were tangible assets that had an expected life and could be depreciated with comfortable precision.

The first generation of architects are extremely good at what they were asked to do - connect hardware through ever-growing and ever-speedier networks. Software, at least the software they intended to run the mission-critical elements of the business on, was considered to be a commodity - just like the hardware and network components they were used to implementing. Their view of software from an enterprise perspective was through the hardware nodes that the software was to be deployed on (this box is for General Ledger, that box is for Accounts Payable, this box is for email, that box is for our website, and so on ...). Rack it, stack it, connect it up, and turn it on.

The new generation of architects don't think of software only in terms of how it is deployed. Disrupting forces like the Internet, Mobile Computing, Virtualization, and Cloud Computing are making physical hardware and networks a ubiquitous (and somewhat abstract) concept. With the changes brought about by these forces, it is not likely that the IT group of the future will even continue to manage physical technology assets within business-owned data centers anymore.

So if IT departments are no longer managing hardware and networks for the business, what will that leave them with? Will IT cease to exist as a business-critical function within the enterprise? The answer, of course, is no (or else I'd be learning a new trade instead of blogging about this one). The answer is that IT departments will "move up the food-chain" within the business - becoming newly responsible for managing and securing its intellectual property (IP).

IP is the beating heart of today's business. It is composed of the knowledge and the processes that form the foundation of a business and give it its unique competitive differentiation within the marketplace. Knowledge and processes are modeled as software, not hardware - and those models can no longer be confined to the physical boundaries of hardware and networks. Look no further than Amazon, Google, and the other major internet-based companies that survived the bubble to form the new guard of today's business for proof that this paradigm shift is real.

So how does this all tie back to SOA? SOA is simply a better way of managing the knowledge and processes that form the IP of your business. SOA is a better way because it is a software architecture that, like the knowledge and processes it manages, is not confined to the physical boundaries of hardware and networks.

So, if I shouldn't approach SOA the way I've approached technology investments in the past, by going out and buying some components from a vendor, connecting them up, and turning them on; how should I approach it? This will be the subject of my next blog - Part II in this series.

Friday, June 20, 2008

Revisiting Requirements Management

The number one job of the Architect in an IT project is to manage requirements. The Architect is in the only position within the enterprise to look deep enough into the business and deep enough into the technology to understand the translation of the requirements between the two. It is the Architect who is on the front-lines of business-IT alignment, and people should look no further when IT seems "out of touch" with the business.

A requirement, in its simplest form, is the codification of a business need. In this regard, requirements should trace naturally to the goals of a business and the metrics that the business uses to track their performance against those goals. Business goals and performance metrics should trace directly to the business plan - which is the embodiment of the business strategy.

Requirements management, the domain of an Architect, is about managing the full life-cycle of a requirement; ensuring that the business value (goals & metrics) traced to that requirement is both: 
  1. effectively captured during requirements analysis and 
  2. efficiently delivered by the technology traced to that requirement. 
In this way, all technology and associated investment should trace naturally to business value. 

Therefore, the first half of an effective requirements analysis process is for the Architect to ensure that this traceability to the business plan is well-defined, and that the business context associated with the requirement is clear to IT. Architects are on the front-line of IT strategy and planning, and it is their responsibility to challenge and guide their business counterparts in applying the same rigor to the business strategy and planning process as they apply to the technology strategy and planning process.

The second half of the requirements analysis process is for the Architect to ensure that the traceability to the technology plan is well defined, and that the IT context associated with the requirement is clear to the business. The IT context associated with a requirement can be broken down into three main areas: 
  1. the expected impact of the requirement on the enterprise architecture, 
  2. the planning input for the project required to deliver it, and 
  3. the expected impact on IT operations to support it once it has been delivered.
The impact of a requirement (or set of requirements) on the enterprise architecture should be captured across three main enterprise technology views: the software architecture, the server (or hardware) architecture, and the network architecture. In order to properly trace a requirement from business to one of these technology views, the original requirement needs to be "refined". The process of requirements refinement is a meticulous one that requires all of the skill of an experienced Architect. Refining requirements involves deriving new, more detailed requirements from the base requirement that are specific to the technology view being considered. This process should not add to or remove from the scope of the original requirement, nor otherwise alter it. The motivations, assumptions, and constraints considered while deriving a software, hardware, or network requirement from a business requirement should be well-documented by the Architect. This documentation serves as a feedback mechanism from the Architects to their business counterparts on their interpretation of the original requirement. Requirements refinement and its associated documentation serve as the basis for involving the business in technology decisions by providing transparency to the IT strategy and planning process.

Planning input should include the basis of estimate that drives the variable costs, any required fixed costs, and any notable risks associated with delivering the requirement (or set of requirements). The first half of the basis of estimate should be a quantifiable metric (other than hours) that drives effort. Some examples of quantifiable metrics include the number of expected users, the number of screens to be developed, or the number of entities in the data model. The second half of the basis of estimate should be an estimate of hours, by resource type, associated with delivering the requirement across that metric. For example, it may be estimated that it will take 6 hours of a DBA, 1 hour of a business analyst, and 1 hour of a project manager to implement the necessary physical database objects associated with a single entity. This information can then be used to generate the staffing plan for the project delivering the set of requirements being planned: the consolidated effort estimate for each resource type is smoothed across the full timeline of the project, and a cost per hour for each resource type determines the cost of that effort within the scope of the project. Fixed costs from subcontractors or vendors that will be providing specific services and/or products needed in the delivery of a requirement should also be included in the planning input for each requirement. Finally, any risks known at the time of planning associated with delivering the requirement should be well-documented. Included in the risk documentation should be the probability that the risk will be realized and its potential impact on project cost, time, and/or quality.
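The basis-of-estimate arithmetic described above can be followed end-to-end with a small worked example. The per-entity hours come from the text; the entity count and hourly rates are made-up numbers for illustration:

```python
# Worked example of the basis-of-estimate arithmetic: per-entity hours by
# resource type (from the text) times a quantifiable metric (entity count),
# then costed at hypothetical hourly rates.

hours_per_entity = {"dba": 6, "business_analyst": 1, "project_manager": 1}
rate_per_hour = {"dba": 90.0, "business_analyst": 70.0, "project_manager": 80.0}
entities = 25  # the quantifiable metric driving effort (hypothetical)

# Consolidated effort estimate per resource type across the metric.
effort = {role: h * entities for role, h in hours_per_entity.items()}

# Variable cost: effort per resource type times that type's hourly rate.
cost = sum(effort[role] * rate_per_hour[role] for role in effort)

# effort -> {'dba': 150, 'business_analyst': 25, 'project_manager': 25}
# cost   -> 150*90 + 25*70 + 25*80 = 17250.0
```

In a real plan, the per-type effort would then be smoothed across the project timeline to produce the staffing plan, with fixed costs and risk-adjusted contingency layered on top.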

The last step in the IT requirements analysis process is the development or modification of a concept of operations for the requirement (or set of requirements) to be delivered. This concept of operations should include any on-going monitoring, management, or administrative activities required to operate the new software, hardware, and/or network components delivered in fulfilling the requirement. Like project planning input, the concept of operations should document the basis of estimate for any ongoing variable costs, any regular fixed costs and their timing interval, and any risks associated with operations in support of the components being delivered to meet this requirement (or set of requirements).

Once proper analysis has been done for a requirement, an Architect's job is not over. It is the responsibility of the Architect to monitor the delivery and ongoing operations of each requirement for variances to plan. Variances in cost and time (efficiency) during delivery or operations, including the realization of risk, should be analyzed by the Architect to determine if they are one-time or systemic in nature. Systemic variances in IT efficiency discovered over time should be re-factored back into the requirements analysis output with proper change management controls in place to notify interested business and IT stakeholders. Finally, it is the Architect's responsibility to periodically review the business goals and performance metrics that each requirement is tied to, ensuring that the expected business value has been delivered by IT. Variances in delivering expected business value (effectiveness) should also be analyzed by the Architect to determine if they are one-time or systemic in nature. Systemic variances in IT effectiveness discovered over time should be re-factored back into the IT strategy and planning process with proper change management controls in place to notify interested business and IT stakeholders.

Business and IT stakeholders today often hold the misconception that architecture is a project-based activity. However, true requirements management, which is at the heart of all modern architecture, involves the regular monitoring and course-correcting (governance) of business-IT alignment. Requirements management defines architecture as an on-going relationship between business and IT; one that should not be confined to the delivery of specific IT projects.

Monday, May 19, 2008

ReFactoring Enterprise Architecture

I was listening to David Linthicum's podcast from last week and got to thinking more about the problem of Re-Factoring Enterprise Architecture. Re-Factoring your Enterprise Architecture is a variant on the Enterprise Modernization angle we have been pursuing within the technology industry for legacy platform customers. The reason Enterprise Modernization is not enough in my mind is that it is only really targeted at customers (like the government) who are still on homogeneous (albeit legacy) platforms and, due to personnel training and retention issues, want to stay that way [homogeneous]. Most commercial customers likely have a highly heterogeneous environment or want one to better keep up with the heightened competition within their industry.

Most commercial enterprises would agree that their Enterprise Architecture has organically grown over the years in a similar way to how the Amish sew together a patchwork quilt. This is the problem of "shopping for your EA solution".

On the left side of the "shopping for an EA" continuum, you have customers that consider themselves early adopters and are willing to try new things to "get an edge" on their competition. [insert your favorite market guru here] tells these customers that they should have a portal, and that Plumtree is one of the best point-solutions out there, so they buy that. [insert your favorite market guru here] tells them they should have a framework, and that SilverStream is one of the best point-solutions out there, so they buy that. [insert your favorite market guru here] tells them they need an ESB solution, and that Cape Clear is one of the best point-solutions out there, so they buy that. And so on. These folks may be fast out of the gate, but lose momentum over time, due to inefficiencies caused by lack of integration along the way and the rapid turnover of products (and vendors) caused by the fickleness of the commodity software industry.

On the right side of the "shopping for an EA" continuum, you have customers that consider themselves more conservative and place a high value on staying with a single vendor. These folks are still "shopping for an EA", except that they wait to be told what to buy and when to buy it by IBM or Microsoft. These folks are always dealing with repressed feelings of frustration and doubt caused by their inability to keep up with software market innovation because their chosen vendor isn't getting them there fast enough.

In my opinion, customers at either end of this continuum can be refactored toward a planned Enterprise Architecture, highly customized to their specific needs, that can deal with change in a "systemic" or "repeatable" fashion through the combination of:

  1. the strategic use of Open Source software to get at the core problems within the IT portfolio by either better gluing the pieces together or more rapidly extending the functionality of the monolith (with either approach based on open standards) - making the IT-side of these commercial customers more agile while also providing a forward-looking context for better supporting their ingrained spending habits, and
  2. the tactical use of Governance solutions to give the business-side of these commercial customers visibility into the measurable (metrics-driven) progress they are making towards the goals that define their reasons for investing in IT to begin with.

To summarize, I like the "ReFactoring your Enterprise Architecture" angle because it deals with the heterogeneity (or lack thereof) that likely exists in most commercial enterprises while still sending the "we aren't here to change you ... just make you better" message that Enterprise Modernization sends.