Surprise! The cloud can fail!

Friday, 22 April 2011 14:45

If you did not read the news, perhaps you noticed how multiple services from different companies were failing at the same time: Foursquare, Reddit, Hootsuite, Quora and several hundred other companies went down or suffered problems with the collapse of Amazon Elastic Compute Cloud (EC2), a cloud computing platform used by a great many services.

Apparently, the promise of independence between systems that Amazon offered as a guarantee of security and stability failed miserably: several redundant systems in geographically separate locations failed at the same time because, it seems, of a runaway backup process that made countless copies of itself, in a cascading effect that quickly consumed all available space and resulted in what has been termed "cloudgate" or "cloudpocalypse." Something that, indeed, should never occur, and that raises questions of all kinds about the maturity and development of cloud computing as a whole.

Or not? In reality, is what happened so different from the failure of a power plant? Or the failure of a drinking water supply? If there is one thing we know about technology, it is that it cannot be kept from failing, and what we do is take appropriate measures so that, when it fails (not "if it fails": that it will fail can be classed as a metaphysical certainty), the effects of the failure are as mild as possible. The power fails at my house often enough that years ago I decided to buy a modest uninterruptible power supply (UPS) for domestic use, and I know that such failures are quite common in the lives of many people, not just in Spain but in other countries where I have lived. When it fails, it is a major hassle in your daily life, if not a small catastrophe, because of problems of all kinds. And if you call the company, they apologize and basically tell you that, well, it is a fault and they cannot do anything about it, that things go wrong from time to time. And we are talking about services such as electricity or water, which have been with us for many, many years, in which we trust and on which we build many aspects of our lives, and whose reliability we take for granted.

Okay, the failure should not have occurred. As we have said before, the cloud is as good, or as bad, as your suppliers are. There is no "cloud"; there are companies that provide services in it. Companies in which we place certain levels of trust, and for which we must estimate and evaluate risks, avoiding both one extreme (being consistently left out in the open) and the other (spending more than the risk could actually cost). Both deficiency and excess are problems, ranging from service disruption and loss of reputation to extra cost. Technology, surprise surprise, can fail. If the possibility of that failure is crucial to your company, make it redundant, preferably with different suppliers. A service like this blog you are reading has a number of early warning systems and several alternative procedures in case of an outage within my hosting provider, Acens, and, despite receiving from Acens treatment protocols similar to those of customers whose service is infinitely more critical than mine, it even has a daily backup on Amazon. And what if all of that fails? It matters little to me, because the service provided by this page is anything but critical. The possible impact of my blog being down for a full day is practically nil, because the next day my readers will, surely, still be there: each day I have much more at stake in what happens inside my head and consequently comes out of my keyboard than in what happens on my server.

The important thing is to treat an outage like this one, which occurred at a time of low impact (in the middle of a holiday period and on one of the lowest-traffic days of the year), as something to learn from. For Amazon, that means understanding that failure, within reason, can happen (shit happens), but that other key elements such as communication must not fail. For those who run really critical processes with a significant impact on operations, it means translating that impact into economic value and asking how far they should go in taking measures that alleviate at least some of the potential harm, and making sure this analysis is not a back-of-the-napkin calculation done once when the service was set up, but a dynamic analysis based on the different options available, the evolution of their cost, that of our own volume of operations, and so on. A risk analysis, a cost/benefit analysis, that cannot be neglected.
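
The kind of cost/benefit analysis described above can be sketched in a few lines. This is only an illustration under stated assumptions: every figure (outage hours per year, loss per hour, the price and effectiveness of each measure) is a hypothetical placeholder rather than data from the outage, and the names Mitigation, net_benefit and best_option are invented for the example. The inputs change over time, which is why the calculation is worth rerunning rather than doing once.

```python
from dataclasses import dataclass


@dataclass
class Mitigation:
    name: str
    annual_cost: float       # what the measure costs per year
    downtime_avoided: float  # assumed fraction of outage hours it prevents (0..1)


def expected_annual_loss(outage_hours_per_year: float, loss_per_hour: float) -> float:
    """Expected yearly loss with no mitigation in place."""
    return outage_hours_per_year * loss_per_hour


def net_benefit(option: Mitigation, outage_hours_per_year: float, loss_per_hour: float) -> float:
    """Loss avoided by the option minus what the option costs."""
    baseline = expected_annual_loss(outage_hours_per_year, loss_per_hour)
    return baseline * option.downtime_avoided - option.annual_cost


def best_option(options, outage_hours_per_year, loss_per_hour):
    """Return the option with the highest positive net benefit, or None if none pays off."""
    best = max(
        options,
        key=lambda o: net_benefit(o, outage_hours_per_year, loss_per_hour),
        default=None,
    )
    if best is None or net_benefit(best, outage_hours_per_year, loss_per_hour) <= 0:
        return None
    return best


# Hypothetical figures, for illustration only.
OPTIONS = [
    Mitigation("daily off-site backup", annual_cost=500.0, downtime_avoided=0.25),
    Mitigation("second provider on standby", annual_cost=20_000.0, downtime_avoided=0.75),
]

if __name__ == "__main__":
    # A personal blog: downtime costs almost nothing per hour.
    print(best_option(OPTIONS, outage_hours_per_year=24, loss_per_hour=1.0))
    # A transactional business: the same downtime is expensive.
    print(best_option(OPTIONS, outage_hours_per_year=24, loss_per_hour=5_000.0))
```

With these made-up numbers, no measure pays for itself on strictly economic grounds when downtime is nearly free, while the standby provider comes out ahead once each hour of outage is costly. Changing any input (a cheaper second provider, a higher transaction volume) can change the answer.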

AWS has allowed us to scale a complex system quickly, effectively and extremely cost-effectively. At any given point in time, we have 12 database servers, 45 app servers, six static servers and six analytics servers up and running. Our systems auto-scale when traffic or processing requirements spike, and auto-shrink when not needed in order to conserve dollars. In the ten months since we launched the public beta of our free, self-serve gamification platform we have handled over one billion API calls. Without AWS, that simply would not have been possible with our small team and limited budget. Many others have realized similar benefits from the cloud, and AWS has quickly become a critical part of the startup ecosystem.

Keith Smith, CEO of BigDoor, a company affected by the Amazon Web Services (AWS) outage

Indeed, Amazon Web Services (AWS) went down. No system is one hundred percent error-free, and there are many lessons to learn from this. But without Amazon, many things would be simply impossible. It is simply a balance of cost versus benefit.

For Amazon, the failure will do significant damage. Many things can go wrong, but what must not fail is the essence of what you promised your clients (completely independent systems) or your communication with them. Cloud computing is in its infancy, and we will see failures like yesterday's on numerous occasions. But just as tangible as those failures are its advantages in terms of scalability, flexibility, cost, performance, efficiency and many others, to the point of becoming key advantages that define, for many companies, whether or not they can exist at all: the lower entry barriers that make possible many things that otherwise would not be. Which is not to say that, like everything else, it cannot fail from time to time.

