This last weekend (typically a partially wet Bank Holiday weekend in the UK), British Airways suffered a major IT fault which grounded ALL their flights internationally for most of the weekend!
The root cause(s) of this outage have not been revealed by BA, leaving us to speculate on what did and didn't happen, but from what we have been told, and the subsequent issues faced, we can draw some conclusions.
Firstly, the symptoms of the outage as experienced by BA staff and customers at airports. All BA computer systems went down, meaning there were:
- No phones
- No computers
- No website
- No app / text services
- No baggage handling
- No aircraft tracking
- No logistics monitoring
This meant that BA staff were unable to check in passengers or luggage, unable to board or depart any flights from the UK (due to baggage and other systems being down), unable to land flights already en route (as the departing flights couldn't depart and so couldn't free up space for the incoming flights), and subsequently unable to depart any new flights into the UK.
Thus, the entire BA operation ground to a halt. There are reports of BA staff not even being able to book stranded passengers into hotels, or order cabs, as they had no way of doing so. (Remember, EU law says all airlines must provide food, accommodation and transfers to any passengers delayed overnight.)
So far, BA have said this entire meltdown was caused by a "power spike", and that the fact they "outsourced their IT" to India a couple of years ago had nothing to do with it, as apparently the data centre was in the UK. However, their story doesn't make sense and raises a number of questions:
- Surely their primary data centre had a UPS (Uninterruptible Power Supply) in place, which should have protected against the spike?
- Surely their backup / DR / BC data centre is located in a geographically distant location and so wouldn't be affected by the power spike at the primary data centre?
- Surely their backup / DR / BC plans have scenarios for a power issue occurring?
- Why did it take an entire weekend to get systems back online?
- If "all" IT has been outsourced to India, what local, on-the-ground staff did BA have to fix these issues?