Microsoft’s AI research team accidentally exposed a large trove of private data on GitHub, including a disk backup of two employees’ workstations. The exposed cache included 38 terabytes of sensitive information such as secrets, private keys, passwords, and more than 30,000 internal Microsoft Teams messages from over 300 Microsoft employees, cloud security startup Wiz has found.
The firm said it discovered the GitHub repository, which belongs to Microsoft’s AI research division, as part of its research into the accidental exposure of cloud-hosted data.
Readers of the GitHub repository, which provided open-source code and AI models for image recognition, were instructed to download the models from an Azure Storage URL. However, the URL carried an access token scoped to the entire storage account rather than to the model files alone, exposing additional sensitive information stored alongside them.
Furthermore, the token was configured to allow “full control” permissions instead of read-only, meaning that an attacker could also delete and overwrite existing files.
“However, it’s important to note this storage account wasn’t directly exposed to the public; in fact, it was a private storage account. The Microsoft developers used an Azure mechanism called ‘SAS tokens’, which allows you to create a shareable link granting access to an Azure Storage account’s data — while upon inspection, the storage account would still seem completely private,” Wiz researchers noted.
Shared Access Signature (SAS) tokens are Azure’s mechanism for granting selected clients scoped access to specified Azure Storage resources without sharing the account key: the token is a signed query string appended to the storage URL that encodes which resources it covers, which permissions it grants, and when it expires. Because the storage account itself stays private, the scope and lifetime of the token are the only things limiting what a link holder can do.
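The link in this incident combined three properties worth checking before any storage URL goes into a public README: account-wide scope, write permissions, and a far-off expiry. The script below is an added example written for this article, not code from Wiz or Microsoft. It parses the documented SAS query parameters (sv, ss/srt for an account SAS, sr for a service SAS, sp, se, sig) and flags each of the three problems. The account name, sample URLs and signatures are constructed, and the seven-day lifetime threshold is a policy choice assumed for the example.

```python
"""Audit an Azure Storage URL for an over-broad SAS token.

Added example for this article; not code from Wiz or Microsoft.
Query-parameter names are the documented SAS fields: sv (version),
ss/srt (account SAS services and resource types), sr (service SAS
resource), sp (permissions), se (expiry), sig (HMAC signature).
Sample URLs, account name and signatures are constructed.
"""
from datetime import datetime, timedelta, timezone
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Permission letters a SAS may carry; anything outside READ_ONLY lets the
# holder change or destroy data.
PERMISSION_NAMES = {
    "r": "read", "l": "list", "w": "write", "d": "delete",
    "a": "add", "c": "create", "u": "update", "p": "process",
}
READ_ONLY = {"r", "l"}
MAX_LIFETIME = timedelta(days=7)  # policy threshold assumed for this example


def parse_sas_expiry(se: str) -> datetime:
    """SAS 'se' is ISO 8601 UTC, either a full timestamp or a bare date."""
    for fmt in ("%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%dT%H:%MZ", "%Y-%m-%d"):
        try:
            return datetime.strptime(se, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"unrecognised SAS expiry: {se!r}")


def audit_sas_url(url: str, now: Optional[datetime] = None) -> dict:
    """Report what the token in `url` grants and which of the three
    mistakes it exhibits: account-wide scope, write access, long life."""
    now = now or datetime.now(timezone.utc)
    q = {k: v[0] for k, v in parse_qs(urlparse(url).query).items()}
    if "sig" not in q:
        return {"is_sas": False}

    # An account SAS carries ss (services) and srt (resource types) and is
    # valid for every container and blob in the account; a service SAS is
    # pinned to one resource via sr (b = blob, c = container).
    account_scoped = "ss" in q or "srt" in q
    if account_scoped:
        scope = "storage account (srt=%s)" % q.get("srt", "")
    else:
        scope = {"b": "single blob", "c": "single container"}.get(
            q.get("sr"), "service resource sr=%s" % q.get("sr"))

    sp = q.get("sp", "")
    permissions = [PERMISSION_NAMES.get(ch, ch) for ch in sp]
    write_capable = bool(set(sp) - READ_ONLY)

    expiry = parse_sas_expiry(q["se"]) if "se" in q else None
    long_lived = expiry is None or (expiry - now) > MAX_LIFETIME

    return {
        "is_sas": True,
        "scope": scope,
        "account_scoped": account_scoped,
        "permissions": permissions,
        "write_capable": write_capable,
        "expires": expiry.isoformat() if expiry else None,
        "long_lived": long_lived,
        "safe_to_publish": not (account_scoped or write_capable or long_lived),
    }


# Constructed URLs. The first has the shape of the link described in the
# article: account SAS, write/delete permissions, expiry decades away.
LEAKED_STYLE_URL = (
    "https://examplestorage.blob.core.windows.net/models/?"
    "sv=2021-06-08&ss=b&srt=sco&sp=rwdlac&se=2051-10-09T00:00:00Z"
    "&sig=c0nstructed%2Fsignature%3D"
)
# The second is what a model download link should look like: one blob,
# read only, expiring within a week.
MINIMAL_URL = (
    "https://examplestorage.blob.core.windows.net/models/resnet50.onnx?"
    "sv=2021-06-08&sr=b&sp=r&se=2023-09-24T00:00:00Z"
    "&sig=c0nstructed%2Fsignature%3D"
)

if __name__ == "__main__":
    when = datetime(2023, 9, 18, tzinfo=timezone.utc)  # fixed clock for the demo
    for name, url in (("leaked-style", LEAKED_STYLE_URL), ("minimal", MINIMAL_URL)):
        r = audit_sas_url(url, now=when)
        print(f"{name}: scope={r['scope']}, permissions={r['permissions']}, "
              f"expires={r['expires']}, safe_to_publish={r['safe_to_publish']}")
```

The check keys off the `sig` parameter, which every SAS carries, so it can be run as a pre-commit hook over any URL found in documentation or notebooks. One caveat a practitioner should keep in mind: an ad-hoc SAS signed with the storage account key stays valid until its expiry or until that key is rotated, which is why a decades-long expiry is so costly; scoping the token to a single blob with read-only permission limits the damage, and a short expiry limits the duration.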
As Microsoft explained in a blog post, the leak resulted from a Microsoft researcher inadvertently including this SAS token in a blob store URL while contributing to open-source AI learning models and publishing that URL in a public GitHub repository. The company said it has revoked the SAS token and taken measures to further harden the SAS token feature.
The company also said that no customer data was exposed and that no other internal services were affected by the incident.
Last October, security researchers discovered a misconfigured Azure Blob Storage container maintained by Microsoft that exposed 2.4TB of customer data belonging to more than 65,000 companies across 111 countries.