The 11 Worst Cloud Outages (Fiascos)

As cloud-enabling technologies advance and cloud vendors become more refined, the upkeep of mission-critical workloads also has to improve. Even a short service interruption within the cloud infrastructure can be a nightmare. In reality, this has happened on a number of occasions, resulting in serious harm to business operations.

The impact is severe because almost every business is embracing cloud-enabled technologies, and each service interruption, no matter how fast it is rectified, translates into business outages and losses. Despite continued advancements to guarantee uptime and service maintenance, cloud outage fiascos are still very common.

The causes range from power outages, overloaded servers, and database errors to faulty software updates. Still, establishing the true nature of a failure often remains a mystery to cloud administrators. Here are the 11 worst cloud outages (fiascos).

Twitter, January 19, 2016

One of the top-rated social messaging platforms, Twitter, experienced worldwide problems on the morning of January 19, 2016 after deploying defective code. The internal software update caused the mobile applications and the website to fail for a long period of time. The outage went on for about 8 hours, and Twitter administrators later confirmed the occurrence. The 8 hours felt like an eternity.

Verizon, January 14, 2016

On January 14, 2016, an outage at a Verizon data center disrupted the business operations of JetBlue Airways, delaying flights and forcing a substantial number of passengers to rebook. The outage was attributed to a power failure at the Verizon data center. New York-based JetBlue reported that the resulting network problems seriously affected its mobile applications, check-in, customer support systems, a toll-free phone number, and airport counter systems.

Microsoft Office 365, January 18, 2016

For several days beginning on January 18, 2016, some Office 365 users could not access their cloud-based email accounts. Although Microsoft tried to fix the problem as soon as it was identified, the first attempt failed. In fact, the situation worsened when another wave of email failures hit users five days after the initial outage. In some cases, the problem persisted for more than a week. Although the outage didn’t affect all Microsoft customers, those impacted suffered serious downtime. Microsoft reported that the outage was due to a buggy software update.

Symantec Cloud, April 11, 2016

Symantec Cloud is a portal that provides customers with cloud-based security services. It experienced an outage on April 11, 2016 that lasted about 24 hours, beginning around 6 a.m. Cloud support engineers had to work on the system for the entire day to restore the portal's configuration and bring the database back online. The provider’s status page displayed an apology message to users the entire time. The outage made it cumbersome for Symantec customers to administer some email and web security services, but the provider insisted that the overall cloud infrastructure remained protected and secure.

Google Cloud Platform, April 11, 2016

The Google Cloud Platform was hit by an 18-minute outage on the evening of April 11, 2016. The outage mainly affected VPN services and Compute Engine instances in all regions. Google awarded the affected clients service credits of 25% of their monthly VPN charges and 10% of their monthly Google Compute Engine charges.
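To put those percentages in perspective, here is a minimal sketch of the credit arithmetic. The function name and the sample monthly charges are made up for illustration; only the 25% and 10% rates come from the figures above.

```python
# Credit rates quoted above: 25% of monthly VPN charges,
# 10% of monthly Compute Engine charges.
VPN_CREDIT_RATE = 0.25
GCE_CREDIT_RATE = 0.10


def service_credit(vpn_monthly: float, gce_monthly: float) -> float:
    """Return the total credit for hypothetical monthly charges (same currency)."""
    credit = vpn_monthly * VPN_CREDIT_RATE + gce_monthly * GCE_CREDIT_RATE
    return round(credit, 2)


# Hypothetical customer paying 200 for VPN and 1,500 for Compute Engine per month.
print(service_credit(200.0, 1500.0))
```

For that hypothetical customer the credit is 50 for VPN plus 150 for Compute Engine, a small fraction of the monthly bill, which is why credits rarely make up for the business lost during an outage.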

Salesforce, March 3, 2016

On the 3rd of March 2016, a portion of Salesforce customers in Europe endured a CRM service disruption for about 10 hours. The outage was due to a storage problem in the region. Even after the storage tier was reconnected, some features of the CRM were still unstable, and Salesforce continued to report reduced performance on its EU2 component. The outage affected the customer service operations of tens of thousands of businesses.

Salesforce, May 10, 2016

On a separate occasion, the 10th of May 2016, Salesforce.com again experienced an outage, this time for four solid hours. Remedying the problem fully took about four days. During that period, tens of thousands of businesses were heavily affected because they could not carry out customer service operations. The Salesforce.com CEO took the initiative to apologize to customers for the database failure. The most affected region was North America.

Amazon Web Services, June 4, 2016

Even Amazon Web Services does not escape this list. On the 4th of June 2016, Amazon Web Services went down in Sydney, Australia, after heavy storms knocked out the region's power. The EC2 instances and EBS volumes hosting vital workloads failed as a consequence of the power disruption. Online services and websites went down across the Australian AWS region for about 10 hours. Almost everything using the service was disrupted, from banking services to pizza deliveries. Customers were very disappointed as AWS worked to repair the problem.

Apple, June 2, 2016

It may also be quite surprising to find Apple on this list. Apple’s cloud went through a far-reaching outage in June 2016. As a result, Apple’s popular backup and retail services were taken offline. The failure made it impossible for some users to access multiple App Store and iCloud services. Other services that experienced major disruptions included the Apple TV App Store, the Mac App Store, Apple’s cloud-based photo service, and iTunes.

Microsoft Office 365, February 22, 2016

Microsoft acknowledged on February 22, 2016 that customers in Europe had a rough time accessing their email from their mobile phones and that the service was unavailable for a sustained period. Outages like this highlight the problems that can occur with cloud-based services, which are becoming increasingly popular with organizations looking to escape the hassle of on-site systems management.

Dishonorable Mention – Pokémon Go, July 6, 2016

Pokémon Go is a classic example of a cloud-based service, and that’s the reason it joins this list. Since the launch of the game, there has been a series of outages. These outages have spoiled the fun of playing the game and, at times, have even disconnected players during a monster hunt. Since July 6, 2016, the game has had recurring outages due to overloaded servers and database errors.