On Monday, Fred Trotter, CEO of a healthcare startup called DocGraph, came into work only to discover that his cloud computing provider, Google, had effectively shut down his company, sending him and his team into a panic.
DocGraph, through its sister company, CareSet, sells Medicare data and analysis to help improve patient care and track the effectiveness of drugs. It not only stores its data with Google, but also relies on Google’s machine learning technology, TensorFlow, to help with the analysis.
That meant that when Google shut off access to his whole project, not just a single problematic service, he couldn’t simply move his app to another cloud and start serving his customers again.
But by Friday, Trotter was so impressed with how Google's cloud team ultimately handled the situation that he’s sticking with the company, he told Business Insider.
What went wrong
On Thursday, August 18, Google sent Trotter’s company a warning notice saying that its Google cloud project appeared to be implicated in some sort of hacking attack (Google said it appeared to be involved in “intrusion attempts against a third-party”). Google gave the company three days to knock it off.
The folks at DocGraph were flummoxed by the warning in three ways.
1. Google used its own internal identification number. It was an ID Trotter and his team had never heard of. “It took us a considerable amount of time to figure out what they were talking about when they said ‘625834285688’. We probably lost a day trying to figure out what that meant,” Trotter wrote.
2. All of this was automated. Google said that to fix the problem, the company would need to “request an appeal” by clicking a button on the customer service page. But there was no such button. DocGraph's team sent in a customer service ticket, but there was no way to talk to someone live about it.
3. Google didn’t just shut down the server that was implicated in the problems. It suspended the entire account, locking DocGraph out of all of its servers, data, and services.
Last Monday, Trotter discovered that Google had suspended his company’s account, which not only took its main service offline but also prevented his team from accessing the company’s data stored on Google’s cloud.
“This is a pretty substantial problem, since we had committed to leveraging the Google Cloud at DocGraph. We presumed that Google Cloud was as mature and battle tested as other carrier grade cloud providers like Microsoft, Rackspace, and Amazon. But it has just been made painfully clear to us that Google Cloud is not mature at all,” he wrote in a blog post.
Google was listening … on Twitter
Google then shocked him again, in a good way. Within four hours of Trotter tweeting about the problem, someone from Google had contacted him and restored access to his project.
Trotter says that, it turns out, he and his team bear some responsibility. They had inadvertently set up a server wrong, exposing a hole, and a hacker was using his company’s server to conduct a “denial of service attack,” which is when hackers overload another website or online service with so much traffic that it shuts down.
In other words, Google wasn’t behaving totally irrationally.
“Since then, Google has been doing an extensive post-mortem on the problems and is instituting multiple fixes. In summary, we did totally get stuck in a crack in their automated service model, but once we were actually coordinating with their support team, Google has performed marvelously,” Trotter told us.
Multiple sources have told Business Insider that Google hasn’t yet won the trust of many companies as a go-to cloud provider. It has a reputation for being immature.
In fact, Amazon’s cloud CTO Werner Vogels has been known to throw veiled barbs at Google, telling us earlier this year that “other cloud providers in the market, there’s quite a few of them still sort of in the phase where AWS was five, six years ago — in 2010.”
And later, after the insanely successful Pokemon Go experienced widespread outages (Pokemon Go is said to use Google’s cloud), Vogels sub-tweeted “Dear cool folks at @NianticLabs please let us know if there is anything we can do to help! (I wanted that drowzee).”
DocGraph’s story appears to bear witness to Google’s growing pains.
But Trotter says that it also shows just how serious Google is about getting this right.
“I do think this is evidence that they are learning the cloud game, they have a lot to learn. But I also think the incident is evidence that they are learning at a furious rate. If they do half of what they told us they would be changing in response to the incident, it means their pace of change, and their willingness to provide solid service, is pretty unmatched,” he tells us.
He adds, “All of this is to say, they stumbled, we stumbled, but I am very pleased with the service overall and will likely remain a customer for a long time. I would much rather have a big problem and see it fixed, than have lots of small problems and never see resolutions.”
A lesson learned
On the other hand, he’s learned a lesson, too: it’s best not to rely on just one cloud company. He’s turning to Amazon to provide backup storage for his company’s data, using a service called “Amazon Glacier.”
“While [Google’s] TensorFlow and [Google’s storage service] Nearline make Google clearly the winning cloud service for our company, we cannot tolerate the risk of another ‘world ending’ cloud event,” he tells us.
“That means that we will also be using Amazon Glacier to offset the risks of something like this happening again. While I like what Google offers, there is no excuse for taking the risk of being completely locked out by any single cloud vendor. We will not put ourselves in that position again,” he says.
A Google spokesperson confirms that if Google uncovers any potential abuse situations, it warns the customer, gives them three days to fix it, and will shut down the project if they don’t.
However, the company also acknowledged that Trotter’s experience uncovered some problems with its automated system and said it’s working to fix them, telling us:
“While Fred’s experience with our appeal process was uncommon, we want to make sure this situation doesn’t happen again in the future. We’re evaluating how we can make changes to our appeal process and overall customer support, regardless of the support package.”
The spokesperson adds that, once warned, Google works with customers to help them track down the problem and that Google also sells a variety of customer support packages.
Disclosure: Jeff Bezos is an investor in Business Insider through his personal investment company Bezos Expeditions.