Development Server Sprawl

It is rare for a company’s production environment to be geographically redundant to the point that remote locations genuinely provide fault-tolerant high availability. More often, some element of the application suite’s technology stack cannot seamlessly survive a site-wide outage, be it the aging ERP or the spaghetti of heritage applications glued together over the past decade and now called an integrated product offering with a glitzy web front end. Recognizing that most production applications are highly available only by virtue of the infrastructure design within a single site, and hopefully recoverable over some extended period at another site, leads most organizations to require considerable resilience from their production datacenters. Whilst few organizations invest in truly highly available datacenter facilities, with dual systems that can be maintained or fail without impacting the technology they support, many are investing in substantially redundant facilities: dual PDUs and power sources, multiple network providers, redundant cooling.

This proliferation of facilities redundancy has pushed the cost of datacenter space upwards of twenty-five thousand dollars per delivered kilowatt of power available to support the compute load; a typical 2.5 MW facility costs in excess of sixty million dollars of capital before the first server is racked. The price differential between high and lower resiliency facilities is considerable, and so are the operating expenses, even if one assumes no incremental cost when facilities are taken offline for maintenance.

There are few large corporations in the US that are not currently facing the tough decision of how to sustain the growth of their technology footprint while surviving an economic downturn. Most companies with substantial datacenter capacity are watching the erosion of their excess capacity and contemplating the next multi-million dollar investment required to establish the next area of raised floor.
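To put that capital figure in perspective, here is a minimal back-of-the-envelope sketch; the $25K-per-kilowatt and 2.5 MW numbers are the ones cited above, and nothing else is assumed:

```python
# Back-of-the-envelope capital cost of a resilient datacenter build.
cost_per_kw = 25_000          # ~$25K of capital per delivered kilowatt of IT load
facility_it_load_kw = 2_500   # a typical 2.5 MW facility

facility_capex = cost_per_kw * facility_it_load_kw
print(f"Estimated facility capex: ${facility_capex / 1e6:.1f}M")
# -> $62.5M, i.e. "in excess of sixty million dollars" before the first server is racked
```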
Virtualization as an approach to server consolidation is no longer new and exciting; most CIOs have established programs and are gradually reaping the rewards. However, virtualization only delays the inevitable day when space, power, and cooling run out of capacity.

A challenge with many internally driven infrastructure consolidation initiatives is that the people charged with working out how to decrease the scale and complexity of the environment are the very same people whose livelihood depends upon that complexity and scale. Their ability to objectively pursue the full extent of the opportunity must be questioned to ensure they are motivated to implement the strategies appropriate for the company rather than for their careers.
As I have blogged previously, the developer community, of which I am a proud member, is a bunch of server huggers. Each team, and given the chance each developer, asks for their own environment in which to develop, test, debug, stage, promote, and finally run in production. Expecting developers to develop efficiently in a shared environment is unrealistic; the complexity of debugging a single application is high enough that cramming multiple application development environments into a single server instance would introduce a level of ambiguity that reduces productivity unacceptably. So whilst I don’t support the notion that every developer needs their own set of servers, I do support the notion that giving teams isolated environments, where only their own actions affect the behavior of their technology, is paramount if you wish them to perform efficiently. Why, though, do development environments get hosted in datacenters? More specifically, why do they get hosted in the datacenters supporting production environments? And to be even blunter, why are they supported by resilient facilities? Let’s face it: development environments only need to be working when a developer is working. Is your office complex supported by facilities as resilient as the datacenter? I doubt it. If a development server environment is offline, does that prevent a developer from continuing to code on their desktop or laptop? No. So I posit that development server environments should be supported by relatively non-resilient facilities with a fairly relaxed RTO/RPO, perhaps measured in multiple hours. What is the size and scale of development server environments?
A recent survey at a company I was working with showed that approximately 18% of the server inventory physically residing in the highly resilient datacenter space was not supporting production. If one conservatively estimates that half of those servers are development servers, then approximately 9% of the servers in the datacenter do not need to be there.
One might suggest that large organizations should simply build their own low-resiliency facilities to house their development environments; it seems the logical next step. However, how many companies truly have the scale to fill a datacenter with development servers alone? Tiny datacenters tend not to be as financially efficient, in both capital and operating expense, as their larger counterparts, due to the economies of scale of the mechanical and electrical plant.
The scale of development server need is exaggerated by the general absence of rigorous decommissioning or repurposing policies. Thanks to the one-server-one-use mindset that caused the proliferation in the first place, few application development environments are decommissioned until the application itself is decommissioned from production. These behaviors have led to a bloat of underutilized processors sitting idle, waiting for an app to break or a feature to be developed.

If an organization has one thousand servers in its datacenters, then maybe eighty of them are development servers. Those eighty probably consume between twenty and thirty kilowatts in aggregate, so the capex necessary to build the facilities they occupy equates to roughly half a million dollars. The capital depreciation of datacenter investments is complex, given the varying nature of the building, mechanical, and electrical plant, but a ten-year depreciation period is a reasonable estimate. The annualized depreciation of the space occupied by the development server environment can then be estimated at fifty thousand dollars.
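The arithmetic behind that estimate, as a minimal sketch; the per-kilowatt build cost and depreciation period are the figures used above, and I have taken the conservative low end of the 20–30 kW range:

```python
# Annualized facility depreciation attributable to the development servers.
dev_load_kw = 20                 # low end of the 20-30 kW aggregate draw cited above
facility_cost_per_kw = 25_000    # ~$25K of datacenter capex per delivered kW
depreciation_years = 10          # reasonable estimate for a blended facility build

facility_capex = dev_load_kw * facility_cost_per_kw        # $500K, "roughly half a million"
annual_depreciation = facility_capex / depreciation_years  # $50K per year
print(f"Facility capex for the dev load: ${facility_capex:,.0f}")
print(f"Annualized depreciation:         ${annual_depreciation:,.0f}")
```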
Virtualization delivers tangible savings for most organizations, first from its ability to consolidate servers onto fewer CPUs. Over-allocation, the ability to slice up infrastructure so that the sum of the virtual parts is greater than the physical whole, varies considerably depending upon application type and operating system, as well as the underlying infrastructure that is the source and destination for the environments. It has been my experience that development environments are especially suited to substantial over-allocation. Several facts make this possible: most development servers are substantially idle, since their activity depends on human interaction by developers, and development work is sporadic, following a longer-term product development lifecycle. Additionally, the performance of development environments is far less critical than that of production, so developers can (begrudgingly) accept environments that perform below the target production environment. Server consolidation ratios of twenty to one are not unreasonable to expect, especially if the target environment uses modern multi-core chipsets optimized for virtualized workloads.
Applying the twenty-to-one consolidation ratio to the eighty development servers suggests that four new physical servers could provide the processing necessary to support the development server environment, with adequate performance and isolation to sustain developer productivity. The capital outlay for four servers is probably $100K, which depreciated over four years equals a $25K annual operating expense. The additional operating expense of powering and cooling these four servers, roughly 35,000 kWh per year assuming a conservative PUE, comes to roughly $2,500. The annualized savings opportunity for this environment is thus conservatively estimated at $22.5K. This is, however, a gross oversimplification and underestimate, as I will discuss later. Hopefully, though, this potential savings opportunity whets your appetite sufficiently to warrant reading on.
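Pulling the whole comparison together, as a rough sketch; the consolidation ratio, server price, depreciation periods, and energy figures are the ones used above, while the ~$0.07/kWh electricity rate is my own assumption chosen so the numbers land near the $2,500 figure:

```python
# Rough annual cost comparison: 80 physical dev servers vs. 4 virtualization hosts.
old_annual_facility_depreciation = 50_000   # from the earlier facility calculation

host_count = 80 // 20                       # twenty-to-one consolidation -> 4 hosts
host_capex = 100_000                        # ~$100K for the four physical servers
host_depreciation_years = 4
host_annual_depreciation = host_capex / host_depreciation_years   # $25K per year

annual_kwh = 35_000                         # power + cooling at a conservative PUE
electricity_rate = 0.07                     # assumed $/kWh to reach the ~$2,500 figure
host_annual_power_cost = annual_kwh * electricity_rate            # ~$2,450

annual_savings = (old_annual_facility_depreciation
                  - host_annual_depreciation - host_annual_power_cost)
print(f"Hosts required: {host_count}")
print(f"Estimated annual savings: ${annual_savings:,.0f}")        # ~$22.5K
```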
What would an ideal development server environment look like? Many competing demands are at play in development environments: cost, flexibility, isolation, availability. Attempting to satisfy them all leads infrastructure teams to establish an ever-increasing footprint of servers as each team, project, or consulting group comes in and demands that its own unique environment be established early in a project’s initiation.
At any one time it is unlikely that every production application is being modified by development teams, which leads to the obvious conclusion that multiple development server environments are sitting idle at any particular moment. Why are these servers not repurposed for the next development effort? Depending upon your organization’s capital appropriation practices, asset ownership likely becomes an issue: a specific cost center owner has the development assets sitting on their balance sheet, and this inhibits sharing. The unpredictable nature of production issues that demand rapid diagnostic and development work, coupled with the relatively slow deployment or repurposing times of physical servers, makes dedicated environments attractive to teams responsible for break-fix work.
A number of years ago VMware added a product called Lab Manager to its portfolio. Lab Manager provides a rich user experience for establishing and maintaining complex environments made up of combinations of guest VMs; it can quiesce the guests, capture and store the state of an entire application development server environment, and rehydrate it on demand. This technology creates the starting point for mitigating several of the factors that have driven technology organizations to bloat their development environments. First, the rapid deployment of rarely needed server configurations becomes an efficient and practical exercise. Second, the decommissioning of retired environments becomes a right-click exercise. Third, the establishment of entirely new application development environments takes minutes, not weeks.
It is challenging for many corporations to change their infrastructure investment policies without a substantial impact on the reported top and bottom lines of individual lines of business. The migration from individual business unit asset ownership towards shared infrastructure can get bogged down in financial restatement and other accounting issues, irrespective of the obvious benefits of the strategy. It then becomes attractive to look outside the organization for a less financially intrusive solution. The economics of shared virtual infrastructure allow an external business to offer subscription access to shared virtual development environments at a lower cost than most business units would incur by purchasing and hosting their own physical servers.

If, instead of selling subscriptions to individual guest virtual servers, the infrastructure is sold on the basis of capacity to execute concurrent guests, a business can truly minimize its infrastructure usage by giving each development team the ability to hydrate a specific number of virtual guests concurrently. The need then becomes independent of the number of applications being supported and instead proportional to the concurrent activity of the development team. Not only does this decrease capex and opex, it also allows the rapid spin-up of new development initiatives without the delays sometimes imposed by the time taken to establish development server environments, a hard-to-quantify but very real time-to-market advantage for product development teams. Removing development environments from expensive, highly resilient datacenter space also frees capacity for production server growth, another hard-to-quantify yet equally important opportunity to support revenue growth.
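To illustrate the difference between the two subscription bases, here is a minimal sketch; every number in it (applications per team, guests per environment, concurrently active efforts) is hypothetical and exists only to show why capacity sold by concurrency tracks activity rather than portfolio size:

```python
# Hypothetical sizing comparison: per-environment subscriptions vs.
# a subscription for concurrent guest capacity.
applications_supported = 40       # environments the team is nominally responsible for
guests_per_environment = 3        # e.g. web, app, and database guests
concurrently_active_efforts = 6   # development efforts actually in flight at once

# Selling per guest environment: capacity scales with the whole portfolio.
per_environment_guests = applications_supported * guests_per_environment       # 120 guests

# Selling concurrent capacity: capacity scales with actual concurrent activity.
concurrent_capacity_guests = concurrently_active_efforts * guests_per_environment  # 18 guests

print(f"Guests provisioned per-environment: {per_environment_guests}")
print(f"Concurrent guest capacity needed:   {concurrent_capacity_guests}")
```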
My next post will respond to some of the commonly cited objections and resistance to virtualizing development environments, and to moving infrastructure outside the confines, and security, of an enterprise’s own datacenters.
