Yesterday's global Google services outage was caused by the company's identity management system failing after a bug reduced the storage space available to it.
This system failure prevented users from accessing Gmail, YouTube, Google Drive, Google Maps, Google Calendar, and other Google services.
During the outage, users could not send email from the Gmail mobile apps or retrieve it over POP3 in desktop clients. YouTube visitors saw an error message stating, "There was a problem with the server (503) - Tap to retry."
According to a tweet and a Google status report, the outage was caused by the company's automated quota management system reducing the amount of storage available to Google's authentication system.
"Today, at 3.47AM PT Google experienced an authentication system outage for approximately 45 minutes due to an internal storage quota issue. This was resolved at 4:32AM PT, and all services are now restored," Google stated in a tweet from their Google Cloud account.
Google further clarified the cause of the outage on the Google Cloud status page, stating that the reduced storage caused its identity management system (IdM) to fail.
"Google Cloud Platform and Google Workspace experienced a global outage affecting all services which require Google account authentication for a duration of 50 minutes. The root cause was an issue in our automated quota management system which reduced capacity for Google's central identity management system, causing it to return errors globally. As a result, we couldn’t verify that user requests were authenticated and served errors to our users," Google's status page explains.
An identity management system is used to authenticate users and assign privileges when they log into a system.
If you have ever run out of storage on your computer and found that applications began to crash or stopped working, this is similar to what happened with Google's IdM.
After running out of storage, Google's IdM began returning errors that prevented users from authenticating to Google's services, including Cloud Console, Cloud Storage, BigQuery, Google Kubernetes Engine, Gmail, Calendar, Meet, Docs, Drive, and YouTube.
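To make the failure mode concrete, here is a minimal Python sketch of how a quota cut can turn into authentication errors. It is a toy model only, not Google's design: the `QuotaStore` and `ToyIdentityService` classes, the single hard-coded user, and the assumption that every login must write a record to quota-limited storage are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field


class QuotaExceeded(Exception):
    """Raised when a write would push usage past the allowed quota."""


@dataclass
class QuotaStore:
    """Hypothetical storage backend with a hard quota, in arbitrary units."""
    quota: int
    used: int = 0
    records: dict = field(default_factory=dict)

    def write(self, key, value, size=1):
        # Reject the write if it would exceed the current quota.
        if self.used + size > self.quota:
            raise QuotaExceeded(f"{self.used + size} exceeds quota {self.quota}")
        self.records[key] = value
        self.used += size


class ToyIdentityService:
    """Hypothetical identity service that records a session on each login."""

    def __init__(self, store):
        self.store = store
        self.passwords = {"alice": "s3cret"}  # made-up credentials

    def authenticate(self, user, password):
        # Wrong credentials are a client-side problem: 401.
        if self.passwords.get(user) != password:
            return 401
        try:
            # Assumption: a successful login needs to persist a session record.
            self.store.write(f"session:{user}:{self.store.used}", "active")
        except QuotaExceeded:
            # Storage is unavailable, so even valid logins get a server error.
            return 503
        return 200


store = QuotaStore(quota=100)
idm = ToyIdentityService(store)

before = idm.authenticate("alice", "s3cret")  # storage available
store.quota = 0                               # simulated faulty quota reduction
after = idm.authenticate("alice", "s3cret")   # same valid credentials

print(before, after)
```

In this sketch the credentials never change; only the quota does. That mirrors the point of the status report: requests failed because the system could no longer verify them, not because users did anything wrong.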
Google also states that the outage affected its internal users and tools, delaying both the investigation and the posting of status updates.
To prevent this type of issue from recurring, Google has disabled its automated quota management system while it investigates.