Politico: House e-mails remain down
October 10 2008
OK, Paul Robichaux, you were right -- it was bad admin...
House of Representatives' email, and Internet services on Blackberrys, experienced a severe service outage, which began last night. This failure also has affected employees' access to shared drives and has impacted some websites within the House.gov domain.
...except that if they had a system, say, for example, Lotus Domino, that had active/active clustering available across geographically distributed server locations, maybe an electrical failure wouldn't be a problem. Ah, well, at least they used to have one.
Computer engineers within the Office of the Chief Administrative Officer have literally been working around the clock to resolve the issue.
Although the House's computer systems have emergency backup capabilities, which are still 100 percent operational, this failure was due to an overloaded circuit breaker in one of our data centers. In other words, this problem was electrical, and not related to the integrity of the servers.
Link: Politico: House e-mails remain down
- 2
Henry Ferlauto http://www.geniusinside.com | 10/10/2008 2:17:16 PM
Ed - You can still help IBM's bottom line with this story. Pass it on to the folks who sell the Blade Centers.
"Initial analysis of the situation confirms what systems engineers have suspected for some time: more energy efficient servers are needed within the House’s computing infrastructure," Beard wrote. "By reducing the amount of energy the House’s computing currently demands, and by creating electrical backup systems, I am confident we will greatly diminish outages like this from happening again."
- 3
Darren Duke http://blog.darrenduke.net | 10/10/2008 2:21:30 PM
Not to point out the obvious here (or defend the indefensible), but even if they did have a system capable of clustering and failover (for instance, like Domino), if you don't have 'net access from your location, you are still out. No?
- 4
Charles Robinson http://www.cubert.net | 10/12/2008 5:28:39 PM
It sounds to me like they had clustered servers, but all the servers were on the same grid, so when they blew the circuit it didn't matter. Of course, they should be using online UPSes, and the primary and failover servers should have separate power.
- 5
Kevin Mort http://www.theglobalmind.com | 10/15/2008 9:39:46 AM
@4 - Exactly. It's a failure in design of the datacenter.
With proper electrical redundancy it is less likely the failure would have occurred.
The idea of more efficient servers noted in the quote @2 is another part of it, but it seems to me they need the electrical redundancy first.

So what you are saying, Ed, is that even though they had emergency backup capabilities that were 100 percent operational, the users still couldn't work? If the users couldn't work, how could the emergency backup systems be called 100 percent operational? What about the clustering, log shipping, and other MS availability flavors of the day? Oh wait, I forgot, you can't fail over to your backup unless???
I have to admit this one gave me a chuckle.