Gmail went down for about 20 minutes this afternoon and professionals everywhere were forced to actually be productive (just kidding, they took to Twitter and tweeted jokes about the service’s failure).
— Justin (@JGreenDC) January 24, 2014
In an unfortunate twist, Google’s Site Reliability Engineering team announced that it had scheduled a Reddit AMA literally minutes before Gmail went down.
Did someone say more jokes?
But it was a great opportunity to ask a good question: What’s the protocol for engineers when Gmail does go down?
The full text reads:
Q: Sooo….what’s it like there when a Google service goes down? How much freaking out is done?
A: Very little freaking out actually, we have a well-oiled process for this that all services use – we use thoroughly documented incident management procedures, so people understand their role explicitly and can act very quickly. We also exercise this process regularly as part of our DiRT testing.
Running regular service-specific drills is also a big part of making sure that once something goes wrong, we’re straight on it.