This 1 Error Shut Down Popular Sites For Up To 5 Hours


1 error, 1 wrong stroke of the keyboard, 1 simple mistake caused sites like Slack, Expedia, and the Securities and Exchange Commission to go down for up to 5 hours last week. AWS, or Amazon Web Services, is a cloud-based infrastructure provider. Countless organizations large and small rely on AWS’ services to run their businesses.

Last week, AWS’ Simple Storage Service (S3) crashed. Not only did it crash, it stayed down for approximately 5 hours for some customers. AWS could not even access its own dashboard to update affected customers. Customers had to follow AWS’ Twitter feed to find out when their sites would be restored.

The reality is this error could happen to anyone. In fact, the cause of this AWS crash is something you have in your office right now. So how could one of the largest IaaS providers experience such a crippling error? Humans.

Human error was the source of AWS’ outage last week. An employee, who was following protocol, made one keystroke error while entering a command. That keystroke error shut down hundreds of customers for up to 5 hours. Sites like Expedia, Slack, and Medium had to cease operations until AWS could restore its servers.

Human error is not 100% preventable, but you can put systems in place to protect you and your business should it occur. Check out Datasmith’s recommendations for protecting against human error below, and contact us to see how we can strengthen your defense against human error and other IT threats:

1) Share Procedures

AWS could have had an even bigger problem on its hands if the employee hadn’t been following procedure. Because the AWS employee was following a standard procedure, the mistake was easy to track down. Make sure your employees understand proper IT protocols for sending emails, opening attachments, saving files, accessing the remote network, backing up servers, etc. That way, should something happen, we can retrace your steps to find the source of the error.

2) Have A Backup

At Datasmith we cannot preach this enough: have a backup! Always, always, always have a backup of your website, data, and infrastructure. That way, if something happens, we can spin up a virtual environment so you can continue to operate while we fix the issue. Services like Axcient are easy to use.

3) Protect Against Outside Threats

Although outside threats are technically not human error, many human IT errors can leave you susceptible to attacks from viruses or malware. Making sure your firewall and anti-virus are up to date is a good start. For larger organizations, investing in penetration testing can identify potential weak areas in your infrastructure.

Although we cannot guarantee we will ever be able to 100% prevent human error, we can guarantee that with the right systems in place, a human error does not have to mean hours of downtime. If you’re thinking about IaaS, do not let what happened with AWS deter you from investigating further. IaaS is a safe, secure option for any business, and AWS is just one of many providers. Contact Datasmith with any questions or concerns, or share your thoughts in the comments below!