Friday, October 15, 2010


Enterprise Architecture and Agile Operations

In a recent discussion on Twitter, a few people asked how deeply enterprise architecture practitioners should oversee IT operations. Many people drew the line by saying that EA should manage supply and demand. Of course, I disagree with this line of thinking and wanted to share why...

Have you ever heard of the innovation paradox? It is the phenomenon whereby, in operations, you want to remove as much deviation as possible. This is what helps bring consistency in service. Many IT operations folks have gravitated towards ITIL as a potential solution. The innovation paradox also says that in order to maintain competitive advantage, an enterprise needs to do something other than the status quo (aka current operations). Innovation requires increasing deviation while operations requires minimizing it.

Do enterprise architecture teams ever ask themselves the exact opposite question, exploring what would happen if they periodically allowed operations folks to be, well, a little less operational in their thinking? Why are we constraining innovation to only a few stovepipes? Many EAs believe in Agile methods for software development and are champions of XP, Scrum and Kanban, yet haven't put on their thinking caps as to how to bring innovation to operations. Well, since many EAs are blissfully ignorant of operations, I figured I would throw out a few innovative ideas that can help enable an agile operations ecosystem.

I remember hallway conversations where I bought into an idea that I now believe was 100% wrong. It is way too easy to get caught in a conversation where IT operations folks want to slow down the pace of innovation and push for slower software releases instead of stepping up and figuring out how they can improve their own practices. In many shops, IT operations is even more sinister than information protection departments.

First and foremost, it seems as if pretty much every developer understands the value proposition of version control, yet the practice of versioning operational aspects is lost on most shops. Enterprise architecture should mandate that systems administrators not only version control all patches, but also have a clue as to how to roll them forward and back before release.
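To make the roll forward and roll back idea concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the class name `VersionedServerConfig`, the patch IDs and the package versions are all hypothetical, and a real shop would keep this history in an actual version control system rather than in memory.

```python
from dataclasses import dataclass, field

# Sentinel marking "this key did not exist before the patch".
_MISSING = object()


@dataclass
class VersionedServerConfig:
    """Hypothetical model of a server whose patches are versioned."""

    state: dict = field(default_factory=dict)
    # Stack of (patch_id, undo_record) in the order patches were applied.
    applied: list = field(default_factory=list)

    def roll_forward(self, patch_id: str, changes: dict) -> None:
        # Remember the previous value of every key the patch touches,
        # so the patch can be undone exactly.
        undo = {key: self.state.get(key, _MISSING) for key in changes}
        self.state.update(changes)
        self.applied.append((patch_id, undo))

    def roll_back(self) -> str:
        # Undo the most recently applied patch and return its ID.
        if not self.applied:
            raise RuntimeError("no patch to roll back")
        patch_id, undo = self.applied.pop()
        for key, old_value in undo.items():
            if old_value is _MISSING:
                self.state.pop(key, None)
            else:
                self.state[key] = old_value
        return patch_id


# Illustrative usage with made-up patch IDs and versions.
server = VersionedServerConfig()
server.roll_forward("patch-001", {"openssl": "1.0.0a"})
server.roll_forward("patch-002", {"openssl": "1.0.0b", "kernel": "2.6.35"})
server.roll_back()  # the server is back to the state after patch-001
```

The point of the sketch is that every patch carries enough information to be reversed, which is what lets an administrator rehearse the rollback before the release rather than during the crisis.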

Second, if you ever needed to make a mass change to servers quickly, do you really need an enterprise-class patch management tool? What would happen if you could borrow techniques already used on the Internet and do so for free? I am of the belief that the best way to distribute code to thousands of servers simultaneously would be to leverage BitTorrent software. You know, the stuff that is blocked by those folks who think they are doing security but are really from the hygiene department.
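A rough back-of-the-envelope sketch of why peer-to-peer distribution scales is below. It is a deliberately simplified model and not how BitTorrent actually works (the real protocol splits files into pieces and trades them among peers): here I simply assume that in each round every machine that already has the code uploads it to one machine that does not. The function names and the figure of 10 uploads per round for the central server are my own assumptions.

```python
import math


def rounds_central(servers: int, uploads_per_round: int) -> int:
    """Rounds needed when one central server pushes to a fixed number
    of targets per round."""
    return math.ceil(servers / uploads_per_round)


def rounds_peer_to_peer(servers: int) -> int:
    """Rounds needed when every holder (the seed plus every server that
    already has the code) uploads to one new server per round, so the
    number of holders doubles each round."""
    holders = 1  # the original seed
    rounds = 0
    while holders < servers + 1:  # seed + all target servers
        holders *= 2
        rounds += 1
    return rounds


central = rounds_central(8000, uploads_per_round=10)
swarm = rounds_peer_to_peer(8000)
print(f"central push: {central} rounds, peer-to-peer: {swarm} rounds")
```

Under these assumptions, 8,000 servers take 800 rounds from a central server but only 13 rounds when the servers share with each other, because the swarm grows exponentially while the central push grows linearly.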

Third, don't you find it even remotely curious that a large enterprise with 8,000 servers requires almost 1,000 people in its infrastructure group (roughly 8 servers per person), yet the likes of Facebook can run 50,000 servers with only 60 people (more than 800 servers per person)? Is it because the operations folks at Facebook are specializing generalists who are part of the same team as software developers?

What do you think happens when you not only co-locate business people with developers but also throw in a few operations folks as well? In this scenario, the operations folks may have a clue as to how an application actually works and may even personally know the developers who wrote it, and therefore have a better shot at faster resolution than the arduous crisis meetings I have witnessed, where the blind are leading the blind towards a coordinated resolution.

The interesting thing about Facebook is that they also haven't held onto the broken model of quality assurance as advertised by CMMI level 16 shops in India. Ask yourself: why can't software developers, when surrounded by and co-located with operations folks who have version control, simply skip the QA cycle?

Maybe the challenge is that operations folks don't have a clue as to how to incrementally release software. For example, wouldn't it rock if your load balancers could route light traffic for a new application to a few servers that are monitored by developers, versus the big bang release done today?
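Here is a minimal sketch of the routing decision such a load balancer might make. It is not the configuration of any particular product: the function name `choose_pool`, the pool names "canary" and "stable", and the idea of keying on a request or user ID are all assumptions for illustration.

```python
import hashlib


def choose_pool(request_id: str, canary_percent: int) -> str:
    """Send roughly canary_percent of traffic to the new release.

    Hashing the ID (rather than picking at random) keeps the decision
    sticky: the same ID always lands in the same pool.
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # 0..99
    return "canary" if bucket < canary_percent else "stable"


# Illustrative usage: route 5% of hypothetical users to the new servers.
for user in ("user-1", "user-2", "user-3"):
    print(user, "->", choose_pool(user, canary_percent=5))
```

Raising `canary_percent` in small steps while developers watch the canary servers is the incremental alternative to the big bang release.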

There are of course a few flaws with the above idea. I think it means that a developer would need to see a release into production and not simply throw it over the wall to operations. Imagine if I couldn't go home at night until the release was completed successfully, as tested not by QA but by actual real users. Do you think developers would write better tests and operations folks would have more confidence in the software they are given?
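One way to picture "tested by actual real users" is a simple release gate driven by what the monitored servers observe. The sketch below is purely hypothetical: the function name `release_decision` and the thresholds (1,000 requests, 1% error rate) are assumptions I made up for illustration, not recommended values.

```python
def release_decision(
    requests: int,
    errors: int,
    min_requests: int = 1000,     # assumed: minimum real-user traffic
    max_error_rate: float = 0.01  # assumed: tolerated error rate
) -> str:
    """Decide what to do with a release based on real-user traffic."""
    if requests < min_requests:
        return "hold"      # not enough real users have exercised it yet
    if errors / requests > max_error_rate:
        return "rollback"  # real users are hitting too many errors
    return "promote"       # release is done; the developer can go home


print(release_decision(requests=200, errors=0))
print(release_decision(requests=5000, errors=100))
print(release_decision(requests=5000, errors=10))
```

With these assumed thresholds the three calls yield hold, rollback and promote respectively.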

Anyway, some refinement is in order and I am even thinking about making this a blog series to release additional insight into agile operations. In the meantime, if you are a blogger, please write a response in your blog and link back to this entry. For others, please share your thoughts on Twitter regarding agile operations...
