The cloud embodies deeper risks than mere questions of outages and security, according to a Yale University computer scientist, and we need to be exploring them now.
Yale's Bryan Ford has a preprint up at the arXiv of an article under submission titled "Icebergs in the Cloud: the Other Risks of Cloud Computing." Ford outlines three general concerns about the cloud that go beyond those usually discussed: stability, availability, and preservation.
These risks may not be apparent yet, Ford notes, but could emerge over the longer term.
The stability question arises because feedback loops could develop when different parties in the cloud space provide reactive controllers at different (and mutually opaque) levels in the stack. The example Ford provides is somewhat simplistic but serves to illustrate the concept. Imagine that a cloud client sets up a load balancer across two servers, and that the cloud provider has enabled a power balancer across the two hardware platforms on which those servers run. If the periods at which these separate mechanisms run are the same, and they fall into resonance, a positive feedback loop could result, bringing down both servers (or more).
This general class of problems could exist at various levels and scales in the cloud stack, and it would be difficult or impossible to test for in advance of actual deployment. The proprietary secrecy in which cloud competitors operate would make proactive detection difficult in any case. Ford writes, "We might well risk deploying a combination of control loops that behaves well 'almost all of the time,' until the emergence of the rare, but fatal, cloud computing equivalent of the Tacoma Narrows Bridge."
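The load-balancer and power-balancer scenario can be sketched as a toy simulation. This is not Ford's model, only a minimal illustration under stated assumptions: two servers share a fixed demand and a fixed capacity budget, the client's load balancer moves traffic away from the busier server, the provider's power balancer moves capacity toward it, and both act once per period on the same measurement without knowing about each other. The function names, gains, and limits are all hypothetical.

```python
"""Toy model: two independent controllers acting on the same two servers."""


def clamp(value, low, high):
    return max(low, min(high, value))


def imbalance(share_a, cap_a, demand=1.0, total_cap=2.0):
    """Utilization of server A minus utilization of server B."""
    util_a = share_a * demand / cap_a
    util_b = (1.0 - share_a) * demand / (total_cap - cap_a)
    return util_a - util_b


def simulate(lb_gain, pb_gain, steps=8, share_a=0.55, cap_a=1.0):
    """Return |imbalance| measured at the start of each control period."""
    history = []
    for _ in range(steps):
        # Both controllers sample the same measurement in the same period,
        # and neither knows the other is about to act on it.
        d = imbalance(share_a, cap_a)
        history.append(abs(d))
        # Client's load balancer (assumed): shift traffic away from the busier server.
        share_a = clamp(share_a - lb_gain * d, 0.05, 0.95)
        # Provider's power balancer (assumed): shift capacity toward the busier server.
        cap_a = clamp(cap_a + pb_gain * d, 0.2, 1.8)
    return history


if __name__ == "__main__":
    for label, lb, pb in [("load balancer only", 0.6, 0.0),
                          ("power balancer only", 0.0, 1.2),
                          ("both, same period", 0.6, 1.2)]:
        trace = simulate(lb, pb)
        print(f"{label:20s}", " ".join(f"{x:.3f}" for x in trace))
```

With these assumed gains, each controller on its own damps the initial imbalance. Running together, the two corrections stack, so each period overshoots further than the last until one server is pushed past full utilization. Neither operator would see the problem by testing its own controller in isolation.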
The availability problem involves unsuspected dependencies, again complicated by the often proprietary nature of cloud operations and provisioning arrangements. For example, a client's attempt to provide fail-safe storage could be thwarted if the separate storage providers that were chosen in turn relied on a common underlying network provider. A failure of the latter could take down the two upstream storage hosts on which the ultimate cloud client was relying to provide redundancy.
Such "complex interdependencies [hiding] correlated failure modes that do not become apparent until the bubble bursts catastrophically" could, Ford contends, lead to wide-scale meltdowns that could be the cloud equivalent of stock market "flash crashes."
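The storage example amounts to two replicas whose dependency chains converge on the same provider. If the dependency graph were known, a check for this is simple, as the minimal sketch below suggests. The provider names and the `DEPENDS_ON` map are invented for illustration. In practice the difficulty Ford describes is that a cloud client usually cannot obtain this graph at all.

```python
# Hypothetical dependency map: each service lists the providers it relies on.
DEPENDS_ON = {
    "storage_a": ["hosting_a"],
    "hosting_a": ["network_x"],
    "storage_b": ["network_x"],
    "storage_c": ["network_y"],
}


def transitive_deps(service, depends_on):
    """Collect everything a service relies on, directly or indirectly."""
    seen = set()
    stack = [service]
    while stack:
        current = stack.pop()
        for dep in depends_on.get(current, ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def shared_dependencies(replicas, depends_on):
    """Return providers whose failure would affect every replica at once."""
    dep_sets = [transitive_deps(r, depends_on) for r in replicas]
    return set.intersection(*dep_sets) if dep_sets else set()


if __name__ == "__main__":
    print(shared_dependencies(["storage_a", "storage_b"], DEPENDS_ON))
    print(shared_dependencies(["storage_a", "storage_c"], DEPENDS_ON))
```

Here `storage_a` and `storage_b` look independent at the top level, yet both ultimately rest on `network_x`, so pairing them buys no real redundancy. Pairing `storage_a` with `storage_c` does.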
The preservation issue, the most long-term of the three, takes the problem of digital archiving to its logical conclusion in the cloud. How can anyone preserve digital snapshots of some system for posterity when cloud providers reveal neither the underlying code nor the data on which it acts? It is difficult to see how a cloud-era equivalent of the Internet Archive could ever be developed. How will historians be able to reproduce Google Maps, or World of Warcraft, as they existed in April of 2012?
A Technology Review blog recaps Ford's article, and some of its reader comments shed additional light. A commenter using the nym "exegetor" points to the 1984 book Normal Accidents, by Charles Perrow, which concerns system-level failures. Perrow characterizes systems along two dimensions, complexity and coupling. Simple systems that are tightly coupled (example: railroads) seldom have system-wide failures. Complex, loosely coupled systems (example: universities) likewise rarely fail. But complex systems that are tightly coupled (example: the world economy) will inevitably experience systemic failures. By this logic, the cloud as it is developing will certainly be subject to unanticipated outcomes.
The cloud is in its early days. We will get a handle on these and other risks of this complex and tightly coupled technology platform, and if some cannot be predicted they can at least be hedged. It will be worth keeping an eye on such academic research into the fabric to which we are about to entrust all of society's treasures.