Monday, July 14, 2008

Stress testing gone wild

Last Friday's iPhone launch event became the iPocalypse: the phone activation server was under heavy load, so heavy that it amounted to a denial-of-service attack.

I am very sure Apple did full-scale stress testing before the launch. However, I think this iPocalypse was more of a project planning mistake than a technology issue. Why did Apple decide to launch the firmware 2.0 update on the exact same day it started selling the iPhone 3G? I really appreciate Apple's intention of taking care of existing customers like me. I feel loved. But to be honest, we already paid a year ago and have enjoyed the iPhone for a year. If we are not in the 3G line, that means we don't have to upgrade to 3G. Why not just let the old customers wait for a day or two?

In reality, a stress-tested system goes wild when one of its execution paths holds on to a resource. When this happens, rebooting the server only solves the problem for a few seconds. When users see a "retry later" message, they hit the retry button more, not less. The longer users wait, the faster they hit the retry button. This is human nature in dealing with frustration. Of course, you could change the error message to "Do not retry. It's not gonna work", but a more practical solution is to sit down with your network admin or static web server admin and figure out a strategy to regulate the traffic. If you have a controlled user base (as in the iPhone case), it would be even better to organize the users.
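
To make "regulate the traffic" a bit more concrete, here is a minimal sketch of one common technique, a token bucket. This is only an illustration, not what Apple ran or should have run: the class name TokenBucket, the method try_admit, and the rate and capacity numbers are all made up for the example. The idea is that the server admits requests at a fixed pace, and instead of a vague "retry later" it tells each rejected user how many seconds to wait.

```python
import time


class TokenBucket:
    """Hypothetical limiter: admit about `rate` requests per second,
    allowing short bursts of up to `capacity` requests."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)          # tokens added per second
        self.capacity = float(capacity)  # maximum tokens that can be saved up
        self.clock = clock               # injectable so the sketch can be tested
        self.tokens = float(capacity)    # start with a full bucket
        self.last = clock()

    def _refill(self):
        # Add the tokens earned since the last call, capped at capacity.
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now

    def try_admit(self):
        """Return (True, 0.0) if the request may proceed, otherwise
        (False, seconds until the next token becomes available)."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True, 0.0
        wait = (1.0 - self.tokens) / self.rate
        return False, wait
```

A front end could call try_admit() for each activation request and, on rejection, show the returned wait time in the error message. Giving users a concrete time to come back is one small way of organizing them rather than letting them hammer the retry button.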

Anyway, I picked the right time to activate my iPhone (early Sunday morning, Eastern time), and I have lived happily with my iPhone ever since.

