In the world of “online” and hosting, there’s a lot kicked around about uptime, SLAs, and “the nines”. And no, I’m not talking about a set of characters from a J. R. R. Tolkien saga; I’m talking 99.99% uptime nines.
Having built hosting infrastructures for SMEs and blue chips alike, and having run Berkeley IT and a number of “online related” or, more to the point, “infrastructure related” companies/projects, I am constantly at a loss trying to educate others on this. Thanks to the media (as usual, misinformation and the general media, who would have thought!), Joe Public tends to think that the 99.9% SLA uptime guaranteed by his US-based reseller shared hosting provider is where it’s at.
“My online plate shop is critical to my business, it will never go down with XYZ company, they offer 99.9% SLA”… this is the type of drivel that Mr Joe Public can be overheard spouting, at which point I tend to want to walk over and bash him over the head with a large banana, repeatedly, preferably one that has been dipped in liquid nitrogen.
Let’s clear up a few things.
- Nothing can truly be 100% uptime guaranteed, as something can always go wrong
- A 100% uptime guarantee will carve out “force majeure” though, so dinosaurs coming back to life and rampaging through the datacenter? That’s excluded from your guarantee
- 99.999% uptime is achievable, and usually involves multi-level redundancy, including physical datacenter diversity, i.e. your plate shop being mirrored in a different physical datacenter
- Usually you would need multi-level redundancy for this, all the way up the chain: power, network, distribution, data, component. That’s not something that’s cheap to deliver, although for infrastructure people it is easy
- A 99.9% uptime SLA is what most providers would look to offer as standard
- EXPECT TO PAY A PREMIUM FOR 99.999%. If you aren’t paying a premium, and your hoster claims the nines, then it’s about as likely to be “five nines” as I am to suddenly wake up and find that I have the powers of Spider-Man
- the SLA will usually outline your reimbursement if there is unscheduled downtime; if it doesn’t, move your business elsewhere, as the SLA and uptime claims just aren’t worth anything
- that word in point 7 is important: “unscheduled”. Most, if not all, providers will exclude “scheduled maintenance” from their uptime guarantee, so make sure that your SLA outlines how much notice will be given before this work, and how long the work can realistically last
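To put some rough numbers behind the redundancy points above, here is a minimal sketch of the usual availability arithmetic. It assumes component failures are independent and that failover is instantaneous, neither of which holds perfectly in real life, and the function names `serial` and `redundant` are made up for illustration.

```python
from math import prod


def serial(*availabilities):
    """Availability of a chain where every link must be up (power, network, data...).

    Assumes independent failures; each value is a fraction such as 0.999.
    """
    return prod(availabilities)


def redundant(availability, copies):
    """Availability of `copies` identical, independent replicas where one is enough.

    Assumes failover between replicas costs no time at all.
    """
    return 1 - (1 - availability) ** copies


# Three single links in a row, each at "three nines":
chain = serial(0.999, 0.999, 0.999)

# The same plate shop mirrored across two datacenters, each at "three nines":
mirrored = redundant(0.999, 2)

print(f"single chain of three links: {chain:.6%}")
print(f"two mirrored datacenters:    {mirrored:.6%}")
```

The chain comes out worse than any single link, while the mirrored pair comes out far better than either site alone, which is why the extra nines cost real money: every link in the chain needs a twin.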
Now to quantify a few things:
- 99.999% uptime works out to roughly 0.4 minutes (about 26 seconds) of downtime per month, or 5 minutes of downtime per year
- 99.99% uptime: 4 minutes per month, 52 minutes per year
- 99.9% uptime: 43 minutes per month, 8 hours and 46 minutes per year
- anything less, there’s no point
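If you want to check the figures above for yourself, or run them for a different percentage, the arithmetic is a one-liner. This is a small sketch that assumes an average year of 365.25 days and a month of one twelfth of that; the function name is mine, not from any SLA document.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60      # average year, leap years included
MINUTES_PER_MONTH = MINUTES_PER_YEAR / 12


def allowed_downtime_minutes(uptime_percent, period_minutes):
    """Minutes of downtime permitted in a period at a given uptime percentage."""
    return period_minutes * (1 - uptime_percent / 100)


for pct in (99.999, 99.99, 99.9):
    per_month = allowed_downtime_minutes(pct, MINUTES_PER_MONTH)
    per_year = allowed_downtime_minutes(pct, MINUTES_PER_YEAR)
    print(f"{pct}% uptime: {per_month:.1f} min/month, {per_year:.1f} min/year")
```

Remember that this is only the unscheduled allowance; scheduled maintenance windows typically sit on top of it.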
Pingdom have actually put a really nice little PDF guide together for this, check it out.

Polymath. Serial entrepreneur turned VC, now sitting on both sides of the table, talking tech, finance, and motorbikes.
Point 3.
Yes, achievable, but I would have to use a multihomed IP address. A physical failover will simply be too slow.
The problem with multiple IPs is that clients don’t like the idea or the complexity of having to manage and run mirrored web architectures. The costs of scaling will be two, three, or four times as much, and then there’s the issue of synchronising data between the sandboxed environments.
Point 8 – Love that get-out clause
Agreed. I made the assumption that the provider would be multi-homed anyway; perhaps I shouldn’t work on that assumption, after seeing the “hosting infrastructure” of a recent company we were doing due diligence on for a banking client. Everything running off one leased line, all ecommerce transactions, all for a business with a million in turnover that makes all its money from the web… groovy!