We had an interesting outage today on one of our client's websites. Out of nowhere, the website was inaccessible. The website runs by itself on a dedicated physical Windows 2003 R2 server (probably overkill, I know, but that's a discussion for a different day). After restarting IIS and ColdFusion Application Service, the problem came back several times. My initial thought was that it was a DNS issue, which happens occasionally - the last time it happened was after Hurricane Sandy when we our ISP was out, and we had to make some network config changes. But, it was not a DNS issue. My second thought was that it was a DDOS attack, but, there's very little reason anyone would want to take this site down. When we called our ISP, the operator on the other end noted that traffic was spiking significantly. As it turned out, the client had unintentionally caused a DDOS on the website, after they FTPed a very large video file, and then mass emailed a link to it. Hundreds of people clicked the link and brought the site to its knees.
I am primarily a Website Programmer, but I often have to contribute to server administration at times. Sadly, I'm the resident ColdFusion and IIS expert, but I don't have a lot of experience with this issue. What are some basic steps that I can take to prevent this from happening in the future, since we cannot always control what files the client posts to the website.
Here are some ideas I had, but I'm unsure of the impact:
- Limit the number of connections in IIS.
- Put media files on a separate server (like an Amazon site, etc.).
- File requests of this type currently behind a server-script (i.e. /www.site.com/viewFile.cfm?fileId=1424545, where the fileId references a file off the webroot) that logs requests, and pushes the file to the browser using CFCONTENT. I could edit this script to reject requests when they exceed a certain amount in a given time-frame (i.e. a 5MB can be accessed globally 10 times in an hour). This may cause some users frustration, but, if hundreds of users are attempting to view the file, the site is going to crash anyways, as it did today, which is way more frustrating, since there is no "pretty" message explaining why they can't get to the file.
- Update Request Tuning settings in ColdFusion Administrator.
Maximum number of simultaneous Template requests
is currently set to 20. I could reduce this number to something like 5 just to prevent occurrences like this, but that would likely have an adverse affect on normal use of the website.
I'm open to any suggestions, as I'm continuing my research to report to the CTO with the best options, so that we can put a solution into effect.
Thank you.
UPDATE: Usage Report from the time surrounding the outage:
Comments
Post a Comment