29 December 2008

AT&T has the most screwups of any national carrier!

I love stories like this lovely linkage.
Please allow me to put on my telecom/tech hat for a moment. Stuff like this shouldn't happen unless there's a severe lack of redundancy or a really major disaster, like, oh, Hurricane Katrina.
Big facilities, like the one that lost power in Michigan, usually have battery and generator backups for commercial power outages. These facilities are sometimes unmanned. I'm sure there were relatively few, if any, people at this particular facility at the time of the outage.
That the facility was possibly unmanned at the time of the power outage isn't itself a screwup. That the facility ran on backup power and then lost service is a major screwup.
Explanation: all telecom companies have NOCs, Network Operations Centers. Most carriers have one nationwide; bigger ones, like AT&T, have two. These NOCs monitor alarms for every element tied to the network. There are some more exotic alarms that my feeble mind can't explain, but the really important ones are telco and environmental alarms. Telco alarms typically occur when there's a problem with the cable between a cell site and the landline provider. Environmental alarms are things like door intrusions, loss of power, high temp, etc.
These alarms are relayed to the NOC, where someone monitors them and responds accordingly. A power alarm at an individual cell site is a big deal, enough to open a top-severity case. A power alarm at the Bloomfield facility, which handles service for a wide geographic area, should not only be ticketed right away but also escalated to upper management. Phone calls should be made to make sure reserve power can be maintained until commercial power is restored. That means hauling in huge batteries, fueling generators, and so forth.
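To make that escalation logic concrete, here's a minimal sketch of how a NOC tool might triage an alarm like this one. The alarm fields, severities, and canned responses are my own assumptions for illustration, not AT&T's actual system.

```python
# Hypothetical NOC alarm triage sketch. Names and escalation rules are
# assumptions, not any carrier's real tooling.
from dataclasses import dataclass

@dataclass
class Alarm:
    site: str            # e.g. "cell site" or "Bloomfield switching facility"
    kind: str            # "telco" or "environmental"
    detail: str          # "commercial power loss", "door intrusion", "high temp", ...
    serves_wide_area: bool

def triage(alarm: Alarm) -> str:
    """Return the response an alarm like this should trigger."""
    if alarm.detail == "commercial power loss":
        if alarm.serves_wide_area:
            # A big facility running on batteries/generator: ticket it AND wake up management.
            return "open top-severity ticket, escalate to upper management, dispatch field techs"
        return "open top-severity ticket, dispatch field tech"
    if alarm.kind == "environmental":
        return "open ticket, monitor"
    return "open ticket"

print(triage(Alarm("Bloomfield", "environmental", "commercial power loss", True)))
```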
Big facilities like the one in Bloomfield usually have enough batteries and generator fuel to provide power for at least a few hours.
Trust me when I say I'm not just spouting random nonsense. I've seen a facility like the one in Bloomfield firsthand, and I've monitored stuff like this in the past. I know how stuff should be backed up, and I know what the response to something like this should be.
One thing I would like to point out is the minimum amount of time that passed between the power outage and the moment AT&T's service was actually affected. According to the article I posted above, Chicago's wireless service was interrupted at about 9:30 a.m. Sunday.
However, this lovely linkage from the Detroit Free Press quotes one woman who noticed her power was off at 5 a.m. Eastern time, which of course is 4 a.m. Central.
Using my mad math skills, that's at least five and a half hours between the time commercial power was lost and the time AT&T wireless service was actually affected. That's time during which someone should have opened a ticket, called the power company, notified management, and called someone who actually works at that facility to get their ass over there and make sure the generators were working.
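For anyone who wants to check my mad math skills, here's the same arithmetic spelled out. The exact timestamps are assumptions pieced together from the two articles, with both events expressed in Central time.

```python
from datetime import datetime

# Rough timeline from the two news reports (timestamps are my assumption).
commercial_power_lost = datetime(2008, 12, 28, 4, 0)   # ~5 a.m. Eastern = 4 a.m. Central
wireless_service_down = datetime(2008, 12, 28, 9, 30)  # ~9:30 a.m. Central, per the Chicago report

gap = wireless_service_down - commercial_power_lost
print(gap.total_seconds() / 3600)  # 5.5 -- hours of warning before customers felt it
```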
I believe the word that applies here is fail. Maybe even two words: epic fail.
Of course, you won't see any reporters question this. "Oh, power's out, what can you do?" Few people actually know how things work, or are even remotely interested in finding out.
Here's what I would really like to know. When did AT&T become aware of the power outage? If it was indeed at 9:30 a.m. Central, why didn't they know about it sooner? If they did know about it early Sunday morning, what was the response? When were field technicians engaged? Why did so much time pass between the time of the initial power outage and the time customer service was affected? How does the company plan to avoid situations like this in the future?
But hey, what do I know? I'm no good. That's why they laid my ass off last year. All in the name of Ed Whitacre's $160-million retirement package.
My job. Delivered.

1 comment:

Dan said...

Skynet became self-aware at 2:14am EDT August 29, 1997.